
Data fuels every corner of the modern internet, from social networks and AI research to marketing analytics and political insights. Yet, access to large-scale, high-quality data has long been a privilege reserved for well-funded Web2 corporations.
That's what makes Data Universe (TAO's Subnet 13) such a breakthrough within the Bittensor ecosystem.
In a recent Bittensor Brief (see video below), Mark Jeffrey explored how this subnet transforms the once-exclusive world of social data scraping into an open, decentralized marketplace.
Built to crawl platforms like X (formerly Twitter), Reddit, and YouTube, Data Universe collects billions of data points every week and makes them accessible for a fraction of traditional costs.
It's not just an innovation in data collection; it's a reimagining of who gets to own and use information in the AI era.
The Backbone for Other Subnets
"Here's the cool part," Mark explains. "Subnet 13 doesn't just do its own thing. It actually powers a ton of other subnets inside Bittensor."
He brings up ReadyAI (Subnet 33) as an example. ReadyAI doesn't scrape data itself. It uses Data Universe to fetch the raw material (social posts, conversations, threads) and then adds metadata, cleaning it up for model training. "They basically take what Data Universe brings in and refine it," Mark says.
That's the beauty of composability inside Bittensor. One subnet's output becomes another's input. In this case, Data Universe has become the data infrastructure layer for a growing number of AI-driven projects.
Scale That's Hard to Believe
"Over 55 billion rows of social data have already been collected by Data Universe," Mark points out. "And it's growing by about 1.1 billion records per week."
Those records come from X (Twitter), Reddit, and YouTube, which together make up roughly three-quarters of all online social conversations. The system even updates every 30 seconds. "It's live data, not something that's months old," Mark says. "That's what makes it powerful."
For anyone building or researching in AI, real-time context is gold, and that's exactly what Subnet 13 delivers.
Meet Gravity: The AI-Powered Scraper
Mark then dives into what makes Subnet 13 tick: a tool called Gravity.
"This isn't your ordinary scraper," he says. "It's LLM-powered. You can tell it what you want (say you want to scrape Reddit for the word 'Bittensor' or X for '$TAO') and Gravity builds that dataset for you in minutes."
In his walkthrough, Mark actually creates a new scrape task himself, naming it Bittensor, selecting date ranges, adding keywords, and launching it. "Once I hit go," he explains, "the miners on Subnet 13 get the job, and they start gathering the data for me. I just sit back and watch."
When the scrape finishes, the output is a downloadable CSV file, complete with sentiment, keywords, and timestamps. "That's the magic," Mark smiles. "Decentralized machines out there doing real, valuable work for pennies."
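Once the CSV is downloaded, working with it is ordinary data wrangling. As a minimal sketch, here is how you might tally sentiment labels from such an export using only the Python standard library; note that the column names (`timestamp`, `keyword`, `sentiment`) and the inline sample rows are assumptions based on the article's description, not a documented Gravity schema.

```python
import csv
import io
from collections import Counter

def summarize_sentiment(csv_text: str) -> Counter:
    """Count rows per sentiment label in a Gravity-style CSV export.

    Assumes a "sentiment" column exists; adjust the key to match
    the headers in your actual download.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["sentiment"] for row in reader)

# Hypothetical sample mimicking a tiny slice of a scrape result.
sample = """timestamp,keyword,sentiment
2025-01-01T12:00:00Z,Bittensor,positive
2025-01-01T12:00:30Z,$TAO,neutral
2025-01-01T12:01:00Z,Bittensor,positive
"""

print(summarize_sentiment(sample))
# Counter({'positive': 2, 'neutral': 1})
```

In practice you would pass the contents of the downloaded file (e.g. `open("scrape.csv").read()`) instead of the inline sample.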
The Marketplace: A Walmart for Data
"Now, maybe you don't want to create your own scrape," Mark says. "That's fine, because Data Universe already has a marketplace of pre-built datasets."
From iPhone 17 launch sentiment to the New York City mayoral race, you can browse thousands of cleaned, ready-to-download datasets, all structured and tagged by topic. "They've already figured out what people want," Mark says. "It's like walking into a data Walmart: grab what you need, pay a few bucks, and go."
Compared to Web2 providers that charge hundreds or thousands of dollars for access, these datasets are priced between a few cents and a few dollars. "Honestly," Mark adds, "it's orders of magnitude cheaper than the competition."
Visualizing the Data
For those who like a more hands-on view, Mark highlights the Nebula 3D Interface: a tool for visualizing clusters, trends, and top contributors across networks.
"You can literally see sentiment spreading," he says. "And the sentiment analysis dashboard lets you understand emotion and tone at a glance. Perfect for researchers, marketers, or just anyone tracking what people are talking about online."
Partnerships Across the Network
Subnet 13's impact doesn't stop at scraping. It's now feeding intelligence into multiple parts of Bittensor's growing ecosystem.
Mark lists a few:
a. Score (Subnet 44) uses Data Universe to link sports conversations with match outcomes.
b. Gaia (Subnet 57) integrates it to humanize weather reporting.
c. Savant on tao.app trains its conversational AI using datasets from Data Universe.
"You're seeing real composability here," Mark explains. "One subnet stacks on another. It's decentralized intelligence at work."
Real Use Cases That Matter
Mark's enthusiasm grows as he explains how Subnet 13 can be used beyond crypto.
"AI developers, academics, journalists, marketers: anyone who needs live social data can use this," he says. "You can track sentiment, train models, even monitor public opinion in real time."
He cites examples ranging from behavioral research and political analysis to AI model fine-tuning. "It's all the same raw material: clean, structured human data."
Looking Ahead
Macrocosmos, the team behind Subnet 13, isn't stopping there. They're expanding to academic archives, web sources, and other large datasets, aiming to make Data Universe the foundational layer for AI-ready open data across all of Bittensor.
"I think they've already nailed it," Mark says. "They've built something that every other subnet wants to plug into. It's the heartbeat of decentralized data."
Closing Thoughts
As the episode wraps, Mark sums it up with his usual mix of admiration and humor.
"Subnet 13, Data Universe. I love it, and so should you," he laughs. "Go use it. It's data done right: open, affordable, and alive."
In just a few years, what started as an experimental scraping subnet has evolved into one of Bittensor's most vital systems: proof that in the decentralized age, intelligence doesn't just live on servers. It lives in the network itself.
