
Chutes has implemented a significant update to its cost and deployment framework, reshaping how developers interact with the platform. The changes introduce clearer billing rules, a new one-time deployment fee, and private-by-default chutes that can be shared with collaborators.
The goal is to align infrastructure costs with real usage while reducing inefficiencies, providing developers and teams with more predictable economics and flexible collaboration models.
Introduction
Chutes, Subnet 64, is a decentralized serverless AI compute platform on the Bittensor network. It provides an open, on-demand inference service that lets developers deploy and scale AI models in seconds without managing infrastructure.
With a simple API and web UI connected to a global GPU miner network, Chutes offers a Web3-native alternative to centralized AI clouds, supporting a wide range of models from LLMs to image and audio systems.
The recent upgrades are as follows:
Private, Shareable Custom Chutes
Every chute is now private by default. Developers can selectively share access with teammates or collaborators.
When shared users call the chute, they are billed per use, and their spend is credited back to the original deployer’s balance, offsetting running costs.
With this upgrade, developers gain stronger control over who can access their models, while shared usage reduces each individual’s spend, encouraging collaborative development and testing.
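As a rough illustration of the cost-sharing flow (the ledger structure and function names below are hypothetical sketches, not the Chutes API), a per-use charge to a collaborator is mirrored as a credit to the deployer:

```python
from dataclasses import dataclass, field

@dataclass
class Ledger:
    balances: dict[str, float] = field(default_factory=dict)

    def record_call(self, caller: str, deployer: str, cost: float) -> None:
        # Bill the collaborator per use...
        self.balances[caller] = self.balances.get(caller, 0.0) - cost
        # ...and credit the same amount back to the deployer's balance.
        self.balances[deployer] = self.balances.get(deployer, 0.0) + cost

ledger = Ledger()
ledger.record_call(caller="teammate", deployer="owner", cost=0.02)
print(ledger.balances)  # {'teammate': -0.02, 'owner': 0.02}
```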
One-Time Deployment Fee
New deployments now include a small upfront fee, calculated as three times the hourly rate of the cheapest compatible GPU, multiplied by the number of GPUs requested. Updating an existing chute carries no additional fee.
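In code, the fee works out as follows (a minimal sketch; the rate is a placeholder, not a published Chutes price):

```python
def deployment_fee(cheapest_gpu_hourly_rate: float, gpu_count: int) -> float:
    # One-time fee: 3x the hourly rate of the cheapest compatible GPU,
    # multiplied by the number of GPUs the chute requests.
    return 3 * cheapest_gpu_hourly_rate * gpu_count

# Placeholder rate: cheapest compatible GPU at $1.50/hour, 2 GPUs -> $9.00
print(deployment_fee(1.50, gpu_count=2))
```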
This model encourages efficient deployment and reduces unnecessary churn from frequent redeployments.
Clearer Scaling and Billing Rules
Chutes has refined its billing mechanics to ensure that:
a. Users pay the standard hourly rate per active GPU while an instance is “hot.”
b. There are no charges for cold starts (loading models or images).
c. After traffic stops, billing continues only for the set auto-shutdown period.
d. Developers retain full control over scaling behaviors.
With these rules, developers can minimize idle costs by carefully tuning their scaling settings, and the transparent billing reduces uncertainty in operational costs.
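A rough cost model under these rules might look like the sketch below (placeholder parameters, not official billing code):

```python
def estimate_cost(hot_hours: float, shutdown_window_hours: float,
                  hourly_rate: float, gpu_count: int) -> float:
    # Cold starts are free; billing covers hot serving time plus the
    # configured auto-shutdown window after traffic stops.
    billable_hours = hot_hours + shutdown_window_hours
    return billable_hours * hourly_rate * gpu_count

# 4 hours of live traffic, a 0.5-hour auto-shutdown window, one GPU at a
# placeholder $1.50/hour -> $6.75
print(estimate_cost(4.0, 0.5, hourly_rate=1.50, gpu_count=1))
```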
Pricing References
GPU billing is now tied to the cheapest compatible GPU for a chute. For example, if a model can run on both A100 and H100, billing reflects the A100 rate, even if deployed on an H100. This makes pricing predictable and aligned with minimum viable hardware.
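The effective hourly rate is therefore the minimum across the chute’s compatible hardware, regardless of which GPU it is actually scheduled onto (again using illustrative placeholder rates):

```python
# Illustrative rates only; actual Chutes prices are published separately.
HOURLY_RATES = {"a100": 1.50, "h100": 3.00}

def effective_rate(compatible_gpus: list[str]) -> float:
    # Billing follows the cheapest GPU the chute *could* run on,
    # not the GPU it actually lands on.
    return min(HOURLY_RATES[g] for g in compatible_gpus)

# Compatible with both A100 and H100, deployed on an H100:
# still billed at the A100 rate.
print(effective_rate(["a100", "h100"]))  # 1.5
```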
Model Pricing Refresh (PAYGO)
A new model pricing scheme has been introduced for pay-as-you-go (PAYGO) users. Temporary overrides keep pricing for current deployments unchanged, so work in progress is not disrupted.
Resources
Official Website: https://chutes.ai
Twitter: https://x.com/chutes_ai