
Chutes has implemented a significant update to its cost and deployment framework, reshaping how developers interact with the platform. The changes introduce clearer billing rules, a new one-time deployment fee, and private-by-default chutes that can be shared with collaborators.
The goal is to align infrastructure costs with real usage while reducing inefficiencies, providing developers and teams with more predictable economics and flexible collaboration models.
Introduction
Chutes, Subnet 64, is a decentralized serverless AI compute platform on the Bittensor network. It provides an open, on-demand inference service that lets developers deploy and scale AI models in seconds without managing infrastructure.
With a simple API and web UI connected to a global GPU miner network, Chutes offers a Web3-native alternative to centralized AI clouds, supporting a wide range of models from LLMs to image and audio systems.
The recent upgrades are as follows:
Private, Shareable Custom Chutes
Every chute is now private by default. Developers can selectively share access with teammates or collaborators.
When shared users call the chute, they are billed per use, and their spend is credited back to the original deployer’s balance, offsetting running costs.
With this upgrade, developers gain stronger control over who can access their models, while shared usage reduces each individual’s spend, encouraging collaborative development and testing.
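As a rough illustration of the cost-sharing flow (the ledger structure and function names below are hypothetical sketches, not the Chutes API), a per-use charge to a collaborator is mirrored as a credit to the deployer:

```python
from dataclasses import dataclass, field

@dataclass
class Ledger:
    balances: dict[str, float] = field(default_factory=dict)

    def record_call(self, caller: str, deployer: str, cost: float) -> None:
        # Bill the collaborator per use...
        self.balances[caller] = self.balances.get(caller, 0.0) - cost
        # ...and credit the same amount back to the deployer's balance.
        self.balances[deployer] = self.balances.get(deployer, 0.0) + cost

ledger = Ledger()
ledger.record_call(caller="teammate", deployer="owner", cost=0.02)
print(ledger.balances)  # {'teammate': -0.02, 'owner': 0.02}
```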
One-Time Deployment Fee
New deployments now include a small upfront fee, calculated as three times the hourly rate of the cheapest compatible GPU, multiplied by the number of GPUs requested. Updating an existing chute carries no additional fee.
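In code, the fee works out as follows (a minimal sketch; the rate is a placeholder, not a published Chutes price):

```python
def deployment_fee(cheapest_gpu_hourly_rate: float, gpu_count: int) -> float:
    # One-time fee: 3x the hourly rate of the cheapest compatible GPU,
    # multiplied by the number of GPUs the chute requests.
    return 3 * cheapest_gpu_hourly_rate * gpu_count

# Placeholder rate: cheapest compatible GPU at $1.50/hour, 2 GPUs -> $9.00
print(deployment_fee(1.50, gpu_count=2))
```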
This model encourages efficient deployment and reduces unnecessary churn from frequent redeployments.
Clearer Scaling and Billing Rules
Chutes has refined its billing mechanics to ensure that:
a. Users pay the standard hourly rate per active GPU while an instance is “hot.”
b. There are no charges for cold starts (loading models or images).
c. After traffic stops, billing continues only for the set auto-shutdown period.
d. Developers retain full control over scaling behaviors.
With these rules, developers can minimize idle costs by carefully tuning their scaling settings, and the transparent billing reduces uncertainty in operational costs.
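A rough cost model under these rules might look like the sketch below (placeholder parameters, not official billing code):

```python
def estimate_cost(hot_hours: float, shutdown_window_hours: float,
                  hourly_rate: float, gpu_count: int) -> float:
    # Cold starts are free; billing covers hot serving time plus the
    # configured auto-shutdown window after traffic stops.
    billable_hours = hot_hours + shutdown_window_hours
    return billable_hours * hourly_rate * gpu_count

# 4 hours of live traffic, a 0.5-hour auto-shutdown window, one GPU at a
# placeholder $1.50/hour -> $6.75
print(estimate_cost(4.0, 0.5, hourly_rate=1.50, gpu_count=1))
```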
Pricing References
GPU billing is now tied to the cheapest compatible GPU for a chute. For example, if a model can run on both A100 and H100, billing reflects the A100 rate, even if deployed on an H100. This makes pricing predictable and aligned with minimum viable hardware.
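The effective hourly rate is therefore the minimum across the chute’s compatible hardware, regardless of which GPU it is actually scheduled onto (again using illustrative placeholder rates):

```python
# Illustrative rates only; actual Chutes prices are published separately.
HOURLY_RATES = {"a100": 1.50, "h100": 3.00}

def effective_rate(compatible_gpus: list[str]) -> float:
    # Billing follows the cheapest GPU the chute *could* run on,
    # not the GPU it actually lands on.
    return min(HOURLY_RATES[g] for g in compatible_gpus)

# Compatible with both A100 and H100, deployed on an H100:
# still billed at the A100 rate.
print(effective_rate(["a100", "h100"]))  # 1.5
```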
Model Pricing Refresh (PAYGO)
A new model pricing scheme has been introduced for pay-as-you-go (PAYGO) users. Temporary overrides keep pricing for current deployments unchanged, so work in progress is not disrupted.
Resources
Official Website: https://chutes.ai
Twitter: https://x.com/chutes_ai