Chutes Deploys GLM-5.2 With Hardware-Level Privacy and Frontier Coding Capabilities

Read Time:2 Minute, 29 Second

Chutes (SN64) just started serving GLM-5.2, the new flagship from Z.ai that is currently the strongest open-source coding model available.

The model scores 81.0 on Terminal-Bench 2.1, sitting within four points of Claude Opus 4.8’s 85.0 and clearing Gemini 3.1 Pro. It runs inside a Trusted Execution Environment on Chutes, meaning the GPU operators serving the model cannot read prompts or outputs at the hardware level.

Pricing is $1.40 per million tokens in and $4.40 per million tokens out, putting genuine frontier-adjacent coding performance on decentralized infrastructure at a fraction of typical closed-model rates.

Table of Contents

Performance, Privacy, Price

The release is notable on three separate axes, each of which would be meaningful on its own.

a. On performance:

Z.ai: GLM-5.2 Performance Evaluation With Other LLM Models

1. Terminal-Bench 2.1: 81.0 – Within four points of Claude Opus 4.8 (85.0) and ahead of Gemini 3.1 Pro.

2. SWE-bench Pro: 62.1 – A jump from GLM-5.1’s 58.4, putting it among the strongest open-source results on the benchmark.

3. Native 1M-Token Context – The model handles long-horizon coding and agentic workloads without requiring external chunking.

4. IndexShare Attention Design – A new attention mechanism that cuts per-token compute by approximately 2.9x at full context, which makes the long context usable at production cost.

b. On privacy:

1. TEE Chute Deployment – The model runs inside a Trusted Execution Environment, which means GPU operators on the Chutes network cannot read prompts, outputs, or any data flowing through the model. Confidentiality is enforced at the hardware level rather than by policy.

c. On price:

1. $1.40 input / $4.40 Output Per Million Tokens – Aggressive enough to make sustained agentic and long-context coding workloads economically viable for developers running them at scale.

The combination is what makes the release land. Open-source models have caught up on raw capability for narrow tasks, but most of them ship without privacy guarantees and at price points that erode the economic case. GLM-5.2 on Chutes delivers all three in one package.

Open Models, Real Privacy

The takeaway from this release is what it says about where the open-source side of the AI stack is heading. A model that scores within four points of Claude Opus 4.8 on Terminal-Bench is no longer a research curiosity. Add 1M context with a new attention design and confidential compute at the hardware level, and it becomes a production tool.

Chutes serving the model on decentralized infrastructure with TEE guarantees closes the last gap that has historically pushed serious workloads back to closed providers: confidentiality. The model is available now at the listed pricing. The benchmarks are public. The privacy is at the GPU level rather than the policy level.

➛ Explore GLM-5.2 on Chutes (SN64) Here

Enjoyed this article? Join our newsletter

Get the latest TAO & Bittensor news straight to your inbox.

We respect your privacy. Unsubscribe anytime.

Chutes Deploys GLM-5.2 With Hardware-Level Privacy and Frontier Coding Capabilities

Performance, Privacy, Price

Open Models, Real Privacy

Enjoyed this article? Join our newsletter

Like this:

Be the first to comment

Leave a Reply Cancel reply

Top 10 Frequently Asked Questions About the Root Reborn Proposal

Performance, Privacy, Price

Open Models, Real Privacy

Enjoyed this article? Join our newsletter

Like this:

Related Articles

Exploring Bittensor and Its Early Internet Momentum

Like this:

Parallax Says Data Centers Are Optional Now

Like this:

Full Recap of Chutes’ NS: Generating $1.3M in Revenue and Challenging OpenAI with Privacy & Efficiency

Like this:

Be the first to comment

Leave a Reply Cancel reply