
Data fuels every corner of the modern internet, from social networks and AI research to marketing analytics and political insights. Yet, access to large-scale, high-quality data has long been a privilege reserved for well-funded Web2 corporations.
That's what makes Data Universe (TAO's Subnet 13) such a breakthrough within the Bittensor ecosystem.
In a recent Bittensor Brief (see video below), Mark Jeffrey explored how this subnet transforms the once-exclusive world of social data scraping into an open, decentralized marketplace.
Built to crawl platforms like X (formerly Twitter), Reddit, and YouTube, Data Universe collects billions of data points every week and makes them accessible for a fraction of traditional costs.
It's not just an innovation in data collection; it's a reimagining of who gets to own and use information in the AI era.
The Backbone for Other Subnets
"Here's the cool part," Mark explains. "Subnet 13 doesn't just do its own thing. It actually powers a ton of other subnets inside Bittensor."
He brings up ReadyAI (Subnet 33) as an example. ReadyAI doesn't scrape data itself. It uses Data Universe to fetch the raw material (social posts, conversations, threads) and then adds metadata, cleaning it up for model training. "They basically take what Data Universe brings in and refine it," Mark says.
That's the beauty of composability inside Bittensor. One subnet's output becomes another's input. In this case, Data Universe has become the data infrastructure layer for a growing number of AI-driven projects.
Scale That's Hard to Believe
"Over 55 billion rows of social data have already been collected by Data Universe," Mark points out. "And it's growing by about 1.1 billion records per week."
Those records come from X (Twitter), Reddit, and YouTube, which together make up roughly three-quarters of all online social conversations. The system even updates every 30 seconds. "It's live data, not something that's months old," Mark says. "That's what makes it powerful."
For anyone building or researching in AI, real-time context is gold, and that's exactly what Subnet 13 delivers.
Meet Gravity: The AI-Powered Scraper
Mark then dives into what makes Subnet 13 tick: a tool called Gravity.
"This isn't your ordinary scraper," he says. "It's LLM-powered. You can tell it what you want (say you want to scrape Reddit for the word 'Bittensor' or X for '$TAO') and Gravity builds that dataset for you in minutes."
In his walkthrough, Mark actually creates a new scrape task himself, naming it Bittensor, selecting date ranges, adding keywords, and launching it. "Once I hit go," he explains, "the miners on Subnet 13 get the job, and they start gathering the data for me. I just sit back and watch."
When the scrape finishes, the output is a downloadable CSV file, complete with sentiment, keywords, and timestamps. "That's the magic," Mark smiles. "Decentralized machines out there doing real, valuable work for pennies."
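Once the CSV is downloaded, working with it is ordinary data wrangling. As a minimal sketch, here is how you might tally sentiment labels from such an export using only the Python standard library; note that the column names (`timestamp`, `keyword`, `sentiment`) and the inline sample rows are assumptions based on the article's description, not a documented Gravity schema.

```python
import csv
import io
from collections import Counter

def summarize_sentiment(csv_text: str) -> Counter:
    """Count rows per sentiment label in a Gravity-style CSV export.

    Assumes a "sentiment" column exists; adjust the key to match
    the headers in your actual download.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["sentiment"] for row in reader)

# Hypothetical sample mimicking a tiny slice of a scrape result.
sample = """timestamp,keyword,sentiment
2025-01-01T12:00:00Z,Bittensor,positive
2025-01-01T12:00:30Z,$TAO,neutral
2025-01-01T12:01:00Z,Bittensor,positive
"""

print(summarize_sentiment(sample))
# Counter({'positive': 2, 'neutral': 1})
```

In practice you would pass the contents of the downloaded file (e.g. `open("scrape.csv").read()`) instead of the inline sample.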
The Marketplace: A Walmart for Data
"Now, maybe you don't want to create your own scrape," Mark says. "That's fine, because Data Universe already has a marketplace of pre-built datasets."
From iPhone 17 launch sentiment to the New York City mayoral race, you can browse thousands of cleaned, ready-to-download datasets, all structured and tagged by topic. "They've already figured out what people want," Mark says. "It's like walking into a data Walmart: grab what you need, pay a few bucks, and go."
Compared to Web2 providers that charge hundreds or thousands of dollars for access, these datasets are priced between a few cents and a few dollars. "Honestly," Mark adds, "it's orders of magnitude cheaper than the competition."
Visualizing the Data
For those who like a more hands-on view, Mark highlights the Nebula 3D Interface: a tool for visualizing clusters, trends, and top contributors across networks.
"You can literally see sentiment spreading," he says. "And the sentiment analysis dashboard lets you understand emotion and tone at a glance. Perfect for researchers, marketers, or just anyone tracking what people are talking about online."
Partnerships Across the Network
Subnet 13's impact doesn't stop at scraping. It's now feeding intelligence into multiple parts of Bittensor's growing ecosystem.
Mark lists a few:
a. Score (Subnet 44) uses Data Universe to link sports conversations with match outcomes.
b. Gaia (Subnet 57) integrates it to humanize weather reporting.
c. Savant on tao.app trains its conversational AI using datasets from Data Universe.
"You're seeing real composability here," Mark explains. "One subnet stacks on another. It's decentralized intelligence at work."
Real Use Cases That Matter
Mark's enthusiasm grows as he explains how Subnet 13 can be used beyond crypto.
"AI developers, academics, journalists, marketers: anyone who needs live social data can use this," he says. "You can track sentiment, train models, even monitor public opinion in real time."
He cites examples ranging from behavioral research and political analysis to AI model fine-tuning. "It's all the same raw material: clean, structured human data."
Looking Ahead
Macrocosmos, the team behind Subnet 13, isn't stopping there. They're expanding to academic archives, web sources, and other large datasets, aiming to make Data Universe the foundational layer for AI-ready open data across all of Bittensor.
"I think they've already nailed it," Mark says. "They've built something that every other subnet wants to plug into. It's the heartbeat of decentralized data."
Closing Thoughts
As the episode wraps, Mark sums it up with his usual mix of admiration and humor.
"Subnet 13, Data Universe. I love it, and so should you," he laughs. "Go use it. It's data done right: open, affordable, and alive."
In just a few years, what started as an experimental scraping subnet has evolved into one of Bittensor's most vital systems: proof that in the decentralized age, intelligence doesn't just live on servers. It lives in the network itself.
