Mark Jeffrey Explores How Subnet 13 Turns Social Data into Open Intelligence

Data fuels every corner of the modern internet, from social networks and AI research to marketing analytics and political insights. Yet, access to large-scale, high-quality data has long been a privilege reserved for well-funded Web2 corporations.

That’s what makes Data Universe (TAO’s Subnet 13) such a breakthrough within the Bittensor ecosystem.

In a recent Bittensor Brief (see video below), Mark Jeffrey explored how this subnet transforms the once-exclusive world of social data scraping into an open, decentralized marketplace.

Built to crawl platforms like X (formerly Twitter), Reddit, and YouTube, Data Universe collects billions of data points every week and makes them accessible for a fraction of traditional costs.

It’s not just an innovation in data collection; it’s a reimagining of who gets to own and use information in the AI era.

The Backbone for Other Subnets

“Here’s the cool part,” Mark explains. “Subnet 13 doesn’t just do its own thing. It actually powers a ton of other subnets inside Bittensor.”

He brings up ReadyAI (Subnet 33) as an example. ReadyAI doesn’t scrape data itself. It uses Data Universe to fetch the raw material (social posts, conversations, threads) and then adds metadata, cleaning it up for model training. “They basically take what Data Universe brings in and refine it,” Mark says.

That’s the beauty of composability inside Bittensor. One subnet’s output becomes another’s input. In this case, Data Universe has become the data infrastructure layer for a growing number of AI-driven projects.
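As a loose illustration of that pipeline (these function names are invented for the sketch and are not the actual subnet APIs), the flow Mark describes might look like:

```python
# Hypothetical sketch of the composability pattern: Data Universe (SN13)
# supplies raw posts; a ReadyAI-style (SN33) step refines them with
# metadata. All names here are illustrative, not real endpoints.

def fetch_raw_posts(keyword: str) -> list[dict]:
    """Stand-in for a Data Universe query returning raw social posts."""
    return [
        {"source": "reddit", "text": f"Discussion about {keyword}"},
        {"source": "x", "text": f"{keyword} is trending today"},
    ]

def annotate(posts: list[dict]) -> list[dict]:
    """Stand-in for a ReadyAI-style refinement pass adding metadata."""
    return [
        {**post, "word_count": len(post["text"].split()), "cleaned": True}
        for post in posts
    ]

# One subnet's output becomes another's input.
training_ready = annotate(fetch_raw_posts("Bittensor"))
```

The point of the pattern is that the refinement layer never scrapes anything itself; it only consumes what the data layer emits.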

Scale That’s Hard to Believe

“Over 55 billion rows of social data have already been collected by Data Universe,” Mark points out. “And it’s growing by about 1.1 billion records per week.”

Those records come from X (Twitter), Reddit, and YouTube, which together make up roughly three-quarters of all online social conversations. The system even updates every 30 seconds. “It’s live data, not something that’s months old,” Mark says. “That’s what makes it powerful.”

For anyone building or researching in AI, real-time context is gold, and that’s exactly what Subnet 13 delivers.

Meet Gravity: The AI-Powered Scraper

Mark then dives into what makes Subnet 13 tick: a tool called Gravity.

“This isn’t your ordinary scraper,” he says. “It’s LLM-powered. You can tell it what you want (say you want to scrape Reddit for the word ‘Bittensor’ or X for ‘$TAO’) and Gravity builds that dataset for you in minutes.”

In his walkthrough, Mark actually creates a new scrape task himself, naming it Bittensor, selecting date ranges, adding keywords, and launching it. “Once I hit go,” he explains, “the miners on Subnet 13 get the job, and they start gathering the data for me. I just sit back and watch.”

When the scrape finishes, the output is a downloadable CSV file, complete with sentiment, keywords, and timestamps. “That’s the magic,” Mark smiles. “Decentralized machines out there doing real, valuable work for pennies.”
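Gravity’s exact export schema isn’t spelled out here, but as a rough sketch of what post-processing such a CSV could look like (the column names `text`, `sentiment`, and `timestamp` are assumptions, not the real format), Python’s standard library is enough:

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows mimicking a Gravity-style export;
# the real column names and values may differ.
SAMPLE = """text,sentiment,timestamp
"Bittensor subnets are composable",positive,2024-05-01T12:00:00Z
"$TAO fees too high today",negative,2024-05-01T12:00:30Z
"Trying Data Universe for research",positive,2024-05-01T12:01:00Z
"""

def sentiment_breakdown(csv_text: str) -> Counter:
    """Count rows per sentiment label in a scrape export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["sentiment"] for row in reader)

counts = sentiment_breakdown(SAMPLE)
print(counts)  # Counter({'positive': 2, 'negative': 1})
```

A real export would be read from disk with `open(...)` instead of `io.StringIO`, but the structure of the loop is the same.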

The Marketplace: A Walmart for Data

“Now, maybe you don’t want to create your own scrape,” Mark says. “That’s fine, because Data Universe already has a marketplace of pre-built datasets.”

From iPhone 17 launch sentiment to the New York City mayoral race, you can browse thousands of cleaned, ready-to-download datasets, all structured and tagged by topic. “They’ve already figured out what people want,” Mark says. “It’s like walking into a data Walmart — grab what you need, pay a few bucks, and go.”

Compared to Web2 providers that charge hundreds or thousands of dollars for access, these datasets are priced between a few cents and a few dollars. “Honestly,” Mark adds, “it’s orders of magnitude cheaper than the competition.”

Visualizing the Data

For those who like a more hands-on view, Mark highlights the Nebula 3D Interface: a tool for visualizing clusters, trends, and top contributors across networks.

“You can literally see sentiment spreading,” he says. “And the sentiment analysis dashboard lets you understand emotion and tone at a glance. Perfect for researchers, marketers, or just anyone tracking what people are talking about online.”

Partnerships Across the Network

Subnet 13’s impact doesn’t stop at scraping. It’s now feeding intelligence into multiple parts of Bittensor’s growing ecosystem.

Mark lists a few:

a. Score (Subnet 44) uses Data Universe to link sports conversations with match outcomes.

b. Gaia (Subnet 57) integrates it to humanize weather reporting.

c. Savant on tao.app trains its conversational AI using datasets from Data Universe.

“You’re seeing real composability here,” Mark explains. “One subnet stacks on another. It’s decentralized intelligence at work.”

Real Use Cases That Matter

Mark’s enthusiasm grows as he explains how Subnet 13 can be used beyond crypto.

“AI developers, academics, journalists, marketers — anyone who needs live social data can use this,” he says. “You can track sentiment, train models, even monitor public opinion in real time.”

He cites examples ranging from behavioral research and political analysis to AI model fine-tuning. “It’s all the same raw material: clean, structured human data.”

Looking Ahead

Macrocosmos, the team behind Subnet 13, isn’t stopping there. They’re expanding to academic archives, web sources, and other large datasets, aiming to make Data Universe the foundational layer for AI-ready open data across all of Bittensor.

“I think they’ve already nailed it,” Mark says. “They’ve built something that every other subnet wants to plug into. It’s the heartbeat of decentralized data.”

Closing Thoughts

As the episode wraps, Mark sums it up with his usual mix of admiration and humor.

“Subnet 13, Data Universe. I love it, and so should you,” he laughs. “Go use it. It’s data done right: open, affordable, and alive.”

In just a few years, what started as an experimental scraping subnet has evolved into one of Bittensor’s most vital systems: proof that in the decentralized age, intelligence doesn’t just live on servers. It lives in the network itself.

Subscribe to receive The Tao daily content in your inbox.
