Vocence (SN78) Plans to Become the Front Door to Bittensor


At first glance, it is easy to assume that Vocence (Subnet 78) is simply a voice AI subnet challenging ElevenLabs on naturalness and price. A closer look shows there is more to it.

The bigger thesis laid out by Koyuki, Vocence’s founder, is that voice becomes the front door to AI, to agents, and eventually to Bittensor itself.

Most people are never going to live inside dashboards or learn btcli, but they will speak, and if Vocence makes voice work as that interface, the addressable market changes shape entirely.

The Market Vocence Is Walking Into

The AI voice market was worth around $2.48 billion in 2025 and is projected to reach roughly $2.97 billion in 2026, with ElevenLabs sitting comfortably at the top.

Their pricing structure is the anchor every challenger is measured against:

ElevenLabs Pricing Schedule

a. Free for basic exploration.

b. Starter at $5/month.

c. Creator at $22/month.

d. Pro at $99/month.

e. Scale at $330/month.

f. Enterprise at custom pricing.

The average cost runs roughly $0.30 per 1,000 characters. ElevenLabs wins almost every side-by-side comparison on naturalness, which is the wall every voice AI subnet has to walk into.
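To make the per-character figure concrete, here is a minimal sketch that converts a character count into an approximate bill. The $0.30 per 1,000 characters rate is the rough average quoted above; the function name and the sample character count are hypothetical, chosen only for illustration.

```python
# Rough average quoted in the article: about $0.30 per 1,000 characters.
RATE_PER_1K_CHARS_USD = 0.30


def estimate_tts_cost(num_chars: int, rate_per_1k: float = RATE_PER_1K_CHARS_USD) -> float:
    """Return the approximate cost in USD of synthesizing num_chars characters."""
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return round(num_chars / 1000 * rate_per_1k, 2)


# Hypothetical example: a 50,000-character script.
script_cost = estimate_tts_cost(50_000)
print(f"Estimated cost: ${script_cost:.2f}")
```

Any challenger's pricing can be compared against the same baseline by passing a different `rate_per_1k`.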

Vocence Pricing Plan

Vocence’s structural edge is that subnets pay emissions to miners competing on price and quality, producing a lower cost structure if model quality holds. Nobody has proven a subnet can deliver ElevenLabs-grade voice at lower prices with a real consumer product on top. That is the bet.

What Makes Vocence a Good Bittensor Market

A good subnet needs four things, and Vocence has a defensible answer for each:

a. Hard problem. Human-quality speech from text plus a natural-language voice description, controlling gender, emotion, accent, pitch, and tone simultaneously.

b. Ground truth. Real human speech anchors evaluation. Validators extract traits from real audio, miners generate from that spec, output gets scored against the original.

c. Competition. Intense, close to winner-take-all.

d. Measurable output. Quality, SOTA benchmarks, revenue, and adoption are all verifiable.

Most subnets satisfy two or three of these; Vocence has a credible claim on all four.
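The ground-truth and competition points above can be sketched as a simple scoring loop. This is an illustration only, not Vocence's actual validator code: the trait names come from the list above, but the exact-match scoring rule, the `Submission` type, and the function names are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List

# Traits named in the article; a real validator may use more or different ones.
TRAITS = ("gender", "emotion", "accent", "pitch", "tone")


@dataclass
class Submission:
    miner_id: str
    # Hypothetical: traits extracted from the audio this miner generated.
    traits: Dict[str, str]


def trait_score(reference: Dict[str, str], generated: Dict[str, str]) -> float:
    """Fraction of reference traits the generated audio reproduces (assumed metric)."""
    matches = sum(
        1
        for trait in TRAITS
        if trait in reference and reference[trait] == generated.get(trait)
    )
    return matches / len(TRAITS)


def pick_winner(reference: Dict[str, str], submissions: List[Submission]) -> Submission:
    """Near winner-take-all: the single best-scoring miner is selected."""
    if not submissions:
        raise ValueError("no submissions to score")
    return max(submissions, key=lambda s: trait_score(reference, s.traits))


# Traits a validator might extract from a real human recording (made-up values).
reference = {"gender": "female", "emotion": "calm", "accent": "british",
             "pitch": "low", "tone": "warm"}
submissions = [
    Submission("miner_a", dict(reference)),
    Submission("miner_b", {**reference, "accent": "american", "pitch": "high"}),
]
print(pick_winner(reference, submissions).miner_id)
```

A production scorer would presumably compare audio features on a continuous scale rather than by exact label match; the discrete version here just keeps the flow of spec, generation, and scoring easy to follow.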

The Roadmap Goes Far Beyond TTS

Vocence Project Roadmap

Text-to-speech is the entry point for Vocence, not the destination. The longer vision spans:

a. Speech-to-text alongside the existing TTS layer.

b. Voice-to-voice translation in near real time.

c. Voice cloning and custom voice design.

d. Agentic voice driving end-to-end workflows.

e. APIs and white-label embedding for other products and subnets.

Koyuki kept returning to the point that mass adoption never happens because people care about dTAO or emissions. It happens because the product is cheaper or better than what they already use.

Where the Subnet Stands Right Now

Vocence Network Dashboard

Vocence is roughly one month old, with a lean team, around 132 miners, and 6 validators. Some samples still have audible issues with timing and naturalness. For the thesis to still be interesting in 90 days, several specific things need to happen:

a. Quality has to close the gap with ElevenLabs across the full sample set, not just cherry-picked outputs.

b. Miner and validator counts need to grow without quality collapsing.

c. A consumer-facing product real users can interact with needs to exist.

d. Koyuki needs one distribution moment that lands in front of non-Bittensor users.

If those happen, the thesis becomes real. If they do not, it stays a story.

Why the Founder Is the Strongest Signal

Bittensor has a recurring problem where brilliant teams cannot sell, cannot explain their work clearly, and cannot create attention around it, and the narrative dies regardless of how good the underlying tech is. 

Koyuki does not have that problem; she is intense, fast, aggressive, and visibly confident in her thesis. She wants users, not just emissions, and distribution as well as SOTA results.

She once said: “I don’t back down from what I believe.”

The Long-Term Bet

Vocence is betting that voice becomes the easiest way humans interact with AI, and by extension, with Bittensor. Most people will never type commands or navigate dashboards. They will speak, and the network that owns the voice layer when that transition happens captures something far larger than a slice of a $3 billion TTS market.

The subnet is young, the product is not yet at the level it needs to be, and the road to ElevenLabs parity is long. But the thesis is sound, the founder is the right kind of operator for it, and if Vocence makes voice work as the front door, the rest of the network benefits from being on the other side of it.
