
If you asked most people in the AI industry to name the leading open-source inference provider on the planet, they would probably guess a well-funded Silicon Valley startup backed by hundreds of millions in venture capital. They would be wrong. The top spot, measured by daily token volume on OpenRouter, the largest inference marketplace on the internet, currently belongs to Chutes AI, a decentralized platform running as Subnet 64 on Bittensor, built by a team small enough to fit around a single dinner table.

Florian Stanaringer, an IT consultant and backend developer at Chutes, laid out the full picture in a recent talk delivered to a live audience in Munich. His presentation traced the project’s technical architecture, its improbable rise to market leadership, and the deeper ideological bet that drives it: that AI inference should be a common good, not a luxury controlled by trillion-dollar corporations.
The Problem With Centralized AI Compute
Florian opened with a framing question. If artificial intelligence becomes as essential as water, electricity, or the internet, then who should control it? The current answer, he argued, is deeply unsatisfying.
The biggest players in the space, OpenAI, Meta, and the European Union, are spending hundreds of billions of dollars building massive centralized data centers, some with dedicated power plants, to meet anticipated demand. AI-focused startups, flush with venture capital, are subsidizing inference at below-cost prices to capture market share, making it nearly impossible for smaller competitors to match their pricing on a level playing field.

The result is an industry where access to AI compute is increasingly concentrated in the hands of a few extraordinarily well-capitalized entities. For anyone who believes AI will become essential infrastructure, that concentration poses a serious long-term risk.
Bitcoin’s Blueprint, And Its Limits
Florian’s key conceptual move was to point at the one system that has already solved the problem of building global-scale compute infrastructure without any central coordination: Bitcoin.
Bitcoin mining, he noted, currently consumes more computational resources than the entire AI industry, a claim that sounds strange but that he supported with conservative estimates. No one told miners to build data centers. No government subsidized their hardware purchases. The incentive mechanism alone was enough to produce a planetary-scale network of specialized compute.
That model offers several properties directly applicable to the vision of decentralized AI inference: the self-interest of individual participants aligns naturally with the goals of the system as a whole; participation is permissionless; and the network scales without central coordination.
But Bitcoin’s design also has limitations that matter for AI. The computational work miners perform doesn’t produce a useful commodity beyond securing the ledger. And Bitcoin’s consensus mechanism has no way to evaluate the quality of a service: it can’t tell you whether an AI model was served correctly, quickly, or reliably.

Bittensor was designed to address exactly those gaps. Florian cited co-founder Jacob Steeves’s vision of a network where intelligence itself is the commodity, owned by no one and accessible to everyone. The key innovation is the addition of validators who score miners’ work through a hardcoded incentive mechanism, a system that can quantify not just whether work was done, but how well it was done.
How Chutes Works
Chutes AI is Subnet 64 on the Bittensor network. Its function is straightforward: it serves open-source AI models (DeepSeek R1, Meta’s Llama family, Alibaba’s Qwen series, Mistral, and many others) to users and applications via an inference API.
The architecture is entirely decentralized. Chutes does not own data centers. Instead, GPU operators from around the world register as miners and contribute their hardware to serve models. Validators on the network continuously evaluate miners across multiple dimensions: uptime, speed, reliability, the variety of models they serve, and the cost-efficiency of their hardware. The incentive mechanism translates these scores into emissions: miners who perform well earn TAO, while miners who underperform get pushed to the bottom of the rankings and are eventually deregistered.
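To make the scoring-to-emissions idea concrete, here is a minimal sketch in Python. It is not the actual Subnet 64 incentive mechanism: the weights, the 0-to-1 metric scale, the proportional payout rule, and all names (`WEIGHTS`, `MinerMetrics`, `emission_shares`, `lowest_ranked`) are assumptions made for illustration. The talk names the dimensions validators look at but not how they are combined.

```python
from dataclasses import dataclass

# Hypothetical weights. The article lists these dimensions but gives no weighting.
WEIGHTS = {
    "uptime": 0.30,
    "speed": 0.25,
    "reliability": 0.25,
    "variety": 0.10,
    "cost_efficiency": 0.10,
}


@dataclass
class MinerMetrics:
    """Per-miner measurements, each assumed to be normalized to the range 0..1."""
    name: str
    uptime: float
    speed: float
    reliability: float
    variety: float
    cost_efficiency: float


def composite_score(miner: MinerMetrics) -> float:
    """Weighted sum of the miner's metrics (an assumed scoring rule)."""
    return sum(weight * getattr(miner, key) for key, weight in WEIGHTS.items())


def emission_shares(miners: list[MinerMetrics]) -> dict[str, float]:
    """Split emissions in proportion to composite score (an assumed payout rule)."""
    scores = {m.name: composite_score(m) for m in miners}
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: score / total for name, score in scores.items()}


def lowest_ranked(miners: list[MinerMetrics]) -> str:
    """Name of the miner at the bottom of the ranking, i.e. the deregistration candidate."""
    return min(miners, key=composite_score).name


if __name__ == "__main__":
    fleet = [
        MinerMetrics("miner_a", 1.0, 1.0, 1.0, 1.0, 1.0),
        MinerMetrics("miner_b", 0.5, 0.5, 0.5, 0.5, 0.5),
    ]
    print(emission_shares(fleet))
    print(lowest_ranked(fleet))
```

Under these assumptions, a miner that scores twice as well on every dimension earns twice the share, which is the basic pressure the talk describes: improving any measured metric directly increases a miner's payout relative to its peers.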

Florian emphasized the Darwinian intensity of this competition. Unlike a traditional company where an infrastructure team has a fixed budget and limited internal pressure, Chutes miners are in direct, continuous competition with one another. The best performers capture a larger share of emissions. The worst get replaced. When the team updates the incentive mechanism’s source code, which is open for anyone to inspect or propose changes to, miners adjust their output within minutes, optimizing relentlessly against whatever the new parameters reward.
The system also includes bounty mechanics. When a new open-source model is released and the community wants it available on Chutes, miners race to be the first to serve it. The hype cycle in AI moves fast, and the incentive structure rewards speed of adoption.
For end users, the experience is simple. Chutes offers a chat interface on its website, an API with code snippets for each hosted model, and, remarkably, some models served entirely for free, despite the enormous cost of the underlying hardware.

Number One Despite the Handicap
The most striking data point in Florian’s talk was Chutes’ current market position. On OpenRouter, the largest AI inference marketplace, Chutes is the number-one provider by daily token volume among open-source inference services, processing billions of tokens per day.

The Chutes team has implemented trusted execution environments (TEEs), which provide cryptographic guarantees that no one, including the machine’s owner, can observe or tamper with the data being processed. TEEs open the door to integration with other inference marketplaces.
The Economics of Spare Cycles
A recurring question from the audience was how Chutes can offer inference at prices dramatically below centralized competitors. Florian’s answer came down to opportunity cost.
The network attracts GPU operators across a range of scales, from individual enthusiasts to data centers with thousands of high-end cards. For the data centers, the math is particularly compelling. A facility with a thousand H200 GPUs that only has 900 rented out at any given time can put the remaining 100 to work on Chutes. The alternative is earning nothing. At an opportunity cost of zero, even modest Chutes emissions represent pure upside. This dynamic means the network naturally aggregates the world’s idle compute capacity at prices that centralized providers, burdened by fixed infrastructure costs, cannot match.
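The spare-capacity argument is simple arithmetic, sketched below in Python. The 1,000-GPU and 900-rented figures come from the talk; the earnings rate, the optional marginal cost (for example, extra power drawn under load), and the function names are hypothetical placeholders, not figures from Chutes.

```python
def idle_gpus(total: int, rented: int) -> int:
    """Number of GPUs not currently rented out."""
    return max(total - rented, 0)


def marginal_gain(
    total: int,
    rented: int,
    earnings_per_gpu_hour: float,
    marginal_cost_per_gpu_hour: float = 0.0,
    hours: float = 24.0,
) -> float:
    """Extra income from putting idle GPUs to work, compared with leaving them idle.

    Idle hardware earns nothing, so the baseline is zero. Fixed costs are
    already sunk and are deliberately left out of the comparison.
    """
    net_rate = earnings_per_gpu_hour - marginal_cost_per_gpu_hour
    return idle_gpus(total, rented) * net_rate * hours


if __name__ == "__main__":
    # 1,000 H200s with 900 rented, as in the talk; the 0.50 rate is made up.
    print(idle_gpus(1000, 900))
    print(marginal_gain(1000, 900, earnings_per_gpu_hour=0.50))
```

As long as the net rate is positive, any earnings on the idle cards beat the alternative of zero, which is why even modest emissions can attract this capacity.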
At the same time, the competitive dynamics between miners keep prices efficient. No one can coast on inflated margins because another miner will simply undercut them and capture a larger share of emissions. The result is what Florian described as a brutally efficient market: good for consumers, demanding for miners.
Currently, about 50 different miners contribute compute to Chutes, running thousands of GPU instances that include top-tier hardware like NVIDIA H200 and B200 cards. The network’s revenue, while still at an early stage, has been on a clear upward trajectory since Chutes began charging for inference.

A $100 Million Anomaly
Florian briefly addressed the investment dimension, framing it not as financial advice but as an observation about comparative valuation.

Chutes’ current market capitalization, roughly $110 million via its subnet alpha token, is orders of magnitude below that of competitors serving comparable or smaller inference volumes. Several centralized inference providers are valued in the tens of billions, and ordinary investors generally cannot buy into them in their early stages. Yet Chutes leads its peer group by volume on the largest open marketplace, OpenRouter, and is fully investable by anyone through $TAO.
The Long Game: Community Ownership and Green Compute
Beyond the immediate competitive picture, Florian sketched a longer-term vision that reflects the idealism he said first drew him to the project.
The goal is for Chutes to become fully community-owned and self-organizing, a piece of permanent infrastructure that exists on the internet the way Bitcoin does, unstoppable and accessible to everyone. The current team views itself as custodians, not permanent owners. The endgame is to hand the system to the world.
One speculative but intriguing idea Florian shared involves leveraging the network’s decentralized nature to address AI’s energy problem. Because Chutes miners can operate from anywhere, the network could theoretically incentivize operators to run compute in regions with excess renewable energy, places where solar or wind overproduction would otherwise be wasted. It is always summer somewhere, always daylight somewhere. A decentralized network, unlike a fixed data center in a single location, can follow the cheap energy around the planet.
For a project born from idealism and run by a handful of contributors, that is a remarkable place to be standing.
This article is based on a live presentation by Florian at a technology event in Munich. It has been condensed and organized for clarity.