Is GLM-5.2 cheaper than the frontier?
A working breakdown of LLM API pricing across Anthropic, OpenAI, Google, and Z.ai, at every reasoning effort. Where the popular claim breaks down, where it could technically be true, and what the data says about intelligence.
What the data shows
Three numbers that answer the question. Source: first-party pricing and Artificial Analysis Intelligence Index v4.1.
- 5.1× cheaper per request than Claude Opus 4.8 at medium reasoning effort
- 6.0× cheaper per request than GPT-5.5 at medium reasoning effort
- 51 on the Intelligence Index: the leading open-weights model, roughly tied with Opus 4.8 and GPT-5.5
What people are saying
A widely-shared take from a respected voice in the space argues GLM-5.2 is more expensive than the proprietary frontier at medium reasoning effort. Worth examining.
I see a lot of people hyped about GLM-5.2. Rightfully so! Having an open weight model surpass GPT-5.4 and every Gemini model is dope.
— Theo - t3.gg (@theo) June 21, 2026
That said - it's not cheap. Both Opus 4.8 and GPT-5.5 set to "medium" are cheaper and smarter than GLM-5.2 pic.twitter.com/SPovI1LKnZ
Theo is right about the first half. GLM-5.2 does surpass GPT-5.4 and every Gemini model on the Intelligence Index. The second half, the cheaper and smarter part, breaks when you run the math.
The pricing math
Reasoning-effort parameters control how many tokens a model spends thinking. They do not change the per-token rate.
The reasoning_effort parameter on GPT-5.5, the extended-thinking budget on Claude, and reasoning_effort on GLM-5.2 all work this way. A higher effort setting raises the bill only by raising the number of tokens generated; the per-token rates in the table below stay fixed regardless of effort mode.
| Model | Provider | Input /MTok | Cache hit /MTok | Output /MTok |
|---|---|---|---|---|
| GLM-5.2 | Z.ai | $1.40 | $0.26 | $4.40 |
| Gemini 3 Pro | Google | $2.00 | $0.20 | $12.00 |
| Claude Opus 4.8 | Anthropic | $5.00 | $0.50 | $25.00 |
| GPT-5.5 | OpenAI | $5.00 | $0.50 | $30.00 |
For Opus 4.8 at medium effort to come out cheaper than GLM-5.2, Opus 4.8 would need to emit roughly 5.7× fewer output tokens than GLM-5.2 (the ratio of $25.00 to $4.40 per MTok), enough to close the per-token gap. In practice, GLM-5.2 does use more reasoning tokens at medium effort than Opus 4.8, but the per-token rate gap is large enough that GLM-5.2 still wins.
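The break-even ratio quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that looks only at output rates from the pricing table and ignores input and cache costs; the function name `breakeven_token_ratio` is hypothetical and introduced here for illustration.

```python
# Output rates in USD per million tokens, copied from the pricing table above.
OUTPUT_RATE = {
    "GLM-5.2": 4.40,
    "Gemini 3 Pro": 12.00,
    "Claude Opus 4.8": 25.00,
    "GPT-5.5": 30.00,
}


def breakeven_token_ratio(model: str, baseline: str = "GLM-5.2") -> float:
    """Return how many times more output tokens `baseline` can emit
    before its output cost equals that of `model`.

    Simplification: input and cache-hit costs are ignored.
    """
    return OUTPUT_RATE[model] / OUTPUT_RATE[baseline]


if __name__ == "__main__":
    for name in OUTPUT_RATE:
        print(f"{name}: {breakeven_token_ratio(name):.2f}x")
```

Read the result as a budget: GLM-5.2 stays cheaper on output as long as it emits fewer than that multiple of the other model's output tokens for the same task.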
Cost at a real medium-reasoning workload
20K input tokens, 70% prompt-cache hit, 5K output including reasoning. Reasoning tokens are billed as output on every provider.
Cost per request
Sorted cheapest first.
| Model | Cost per request | vs GLM-5.2 |
|---|---|---|
| GLM-5.2 | $0.06 | 1.0× |
| Gemini 3 Pro | $0.13 | 2.4× more expensive |
| Claude Opus 4.8 | $0.29 | 5.1× more expensive |
| GPT-5.5 | $0.34 | 6.0× more expensive |
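The per-request figures above can be reproduced from the pricing table with a short calculation. One caveat, stated as an assumption: the published totals ($0.06, $0.13, $0.29, $0.34) match 10,000 billed output tokens (answer plus reasoning), so that is the value used below. With 5,000 output tokens the same formula gives lower totals, about $0.03 for GLM-5.2 and $0.16 for Opus 4.8. The names `Rates` and `request_cost` are hypothetical helpers, not part of any provider SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-token prices in USD per million tokens."""
    input: float      # uncached input
    cache_hit: float  # cached input
    output: float     # output, including reasoning tokens


# Copied from the pricing table above.
RATES = {
    "GLM-5.2": Rates(1.40, 0.26, 4.40),
    "Gemini 3 Pro": Rates(2.00, 0.20, 12.00),
    "Claude Opus 4.8": Rates(5.00, 0.50, 25.00),
    "GPT-5.5": Rates(5.00, 0.50, 30.00),
}


def request_cost(rates: Rates, input_tokens: int,
                 cache_hit_fraction: float, output_tokens: int) -> float:
    """Cost in USD of one request, splitting input into cached and uncached."""
    cached = input_tokens * cache_hit_fraction
    uncached = input_tokens - cached
    total = (uncached * rates.input
             + cached * rates.cache_hit
             + output_tokens * rates.output)
    return total / 1_000_000


# Assumed workload: 20K input, 70% cache hit, 10K billed output tokens.
costs = {name: request_cost(r, 20_000, 0.70, 10_000)
         for name, r in RATES.items()}

if __name__ == "__main__":
    baseline = costs["GLM-5.2"]
    for name, cost in sorted(costs.items(), key=lambda kv: kv[1]):
        print(f"{name:16s} ${cost:.2f}  {cost / baseline:.1f}x")
```

Changing the three workload arguments is enough to test a different traffic shape, for example a lower cache-hit rate or a longer reasoning trace.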
Smarter? No. They’re tied.
Artificial Analysis Intelligence Index v4.1, a weighted composite of 9 evaluations including GDPval-AA v2, Terminal-Bench v2.1, HLE, GPQA Diamond, and AA-Omniscience.
AA Intelligence Index v4.1
The four frontier models land within two points of each other. Only Claude Fable 5, at $10 and $50 per MTok, clearly leads.
GLM-5.2 at 51 is the leading open-weights model on the index. On GDPval-AA v2 specifically, the real-world agentic work benchmark, GLM-5.2 scores 1524, ahead of GPT-5.5 (xhigh) at 1514. The Intelligence Index lead belongs to Claude Fable 5 (~60), whose per-token rates are about 2× those of Opus 4.8 and roughly 7× to 11× those of GLM-5.2.
The catch: more output tokens
Theo followed up with a fair point. Cheaper per token does not mean cheaper in time.
It also uses way more output tokens. The tokens are cheaper, but the volume of them means you'll spend much more time waiting for results.
— Theo - t3.gg (@theo) June 21, 2026
Still dope! Just trying to make sure people set their expectations properly pic.twitter.com/hy6NO0CtEq
That’s true, and worth being upfront about: lower per-token prices do not shorten the wait when the model produces more tokens.
Output tokens per task
GLM-5.2 emits the most output tokens of any leading open-weights model. Lower per-token cost, but more tokens to wait for.
At max thinking effort, GLM-5.2 emits 43k output tokens per Intelligence Index task (37k reasoning, 6k answer). For comparison, MiniMax-M3 emits 24k, Kimi K2.6 emits 35k, and DeepSeek V4 Pro (max) emits 37k. That makes GLM-5.2 roughly 80% more verbose than the leanest of these peers (MiniMax-M3) and about 16% to 23% more verbose than the other two.
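The relative verbosity can be computed directly from the per-task token counts quoted above. This is a small sketch using only those four figures; `extra_verbosity_pct` is a hypothetical helper name.

```python
# Output tokens per Intelligence Index task, as quoted in the text above.
TOKENS_PER_TASK = {
    "GLM-5.2": 43_000,
    "MiniMax-M3": 24_000,
    "Kimi K2.6": 35_000,
    "DeepSeek V4 Pro (max)": 37_000,
}


def extra_verbosity_pct(model: str, peer: str) -> float:
    """Percentage by which `model` emits more output tokens than `peer`."""
    return (TOKENS_PER_TASK[model] / TOKENS_PER_TASK[peer] - 1) * 100


if __name__ == "__main__":
    for peer in TOKENS_PER_TASK:
        if peer != "GLM-5.2":
            pct = extra_verbosity_pct("GLM-5.2", peer)
            print(f"GLM-5.2 vs {peer}: {pct:.0f}% more output tokens")
```

Because latency scales with tokens generated at a given decoding speed, these percentages are a rough proxy for extra waiting time, assuming comparable throughput across providers.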
For batch jobs and overnight pipelines, this is irrelevant. The cost advantage dominates. For latency-sensitive interactive use, Opus 4.8 or GPT-5.5 medium will feel snappier, even at higher cost, because they finish thinking in fewer tokens. That is a real trade-off, not a footnote.
Where the claim could be technically true
Three narrow cases where “Opus 4.8 medium beats GLM-5.2 medium on cost” can hold up, if you squint.
Batch API
Anthropic and OpenAI offer 50% discounts on batch jobs. Z.ai has no published batch tier. Against Opus 4.8, the gap shrinks to roughly 2.5× instead of 5×, which narrows GLM-5.2's lead without reversing it.
Subscription vs API
Claude Max or ChatGPT Pro at $200/mo bundles effectively unlimited usage of the top model. That pricing is not comparable to per-token API rates.
Single benchmarks
On individual evals (HLE, certain coding tasks), Opus 4.8 medium or GPT-5.5 medium can outperform GLM-5.2. The composite index places them as tied.
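The batch case above can be sanity-checked with the same per-request arithmetic. This sketch makes two assumptions that are not stated in the source: the 50% batch discount applies to the whole request (including cached input), and the workload is 20K input, 70% cache hit, and 10K billed output tokens. `batch_cost` is a hypothetical helper.

```python
# (input, cache hit, output) rates in USD per million tokens,
# copied from the pricing table earlier in the article.
RATES = {
    "GLM-5.2": (1.40, 0.26, 4.40),
    "Claude Opus 4.8": (5.00, 0.50, 25.00),
    "GPT-5.5": (5.00, 0.50, 30.00),
}

# Batch discounts as described above; no published batch tier for GLM-5.2.
BATCH_DISCOUNT = {"Claude Opus 4.8": 0.5, "GPT-5.5": 0.5}


def batch_cost(model: str, input_tokens: int = 20_000,
               cache_hit: float = 0.70, output_tokens: int = 10_000) -> float:
    """USD cost of one request with the model's batch discount applied.

    Assumption: the discount applies uniformly to the full request cost.
    """
    in_rate, cache_rate, out_rate = RATES[model]
    cached = input_tokens * cache_hit
    uncached = input_tokens - cached
    full = (uncached * in_rate + cached * cache_rate
            + output_tokens * out_rate) / 1_000_000
    return full * (1 - BATCH_DISCOUNT.get(model, 0.0))


if __name__ == "__main__":
    glm = batch_cost("GLM-5.2")
    for model in RATES:
        print(f"{model:16s} {batch_cost(model) / glm:.2f}x GLM-5.2")
```

Under these assumptions the discounted proprietary models remain more expensive per request than undiscounted GLM-5.2, so the batch tier narrows the gap rather than closing it.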
The hype is right
For the first time, an open-weights model is a head-on competitor to the proprietary frontier. That matters.
GLM-5.2 is cheaper than Opus 4.8 and GPT-5.5 at every published reasoning-effort level. It is roughly tied with both on the Artificial Analysis Intelligence Index v4.1. It comes from Z.ai under an MIT license, which means self-hosting, fine-tuning, and zero vendor lock-in are real options. None of the proprietary frontier models offer that.
An open-weights model now competes head-on with the frontier on cost, intelligence, and availability, which is good news for builders who don’t want to depend on a single provider. The hype is correct. Set expectations on latency, then ship.
Sources
Every claim on this page ties back to a first-party source. No invented numbers.
- “Is GLM-5.2 actually cheaper than Opus 4.8 and GPT-5.5?”
- “It also uses way more output tokens...”
- GLM-5.2 is the new leading open weights model
- API pricing — Opus 4.8, Sonnet 4.6, Haiku 4.5
- API pricing — GPT-5.5, GPT-5.4, GPT-5.4 mini
- Vertex AI — Gemini 3 Pro pricing
- GLM-5.2 quick start
Last updated June 21, 2026. Pricing pulled from first-party pricing pages on this date. Intelligence Index v4.1 published June 15, 2026.
Content crafted by The Spiel Engine, from a comparison session, edited and re-designed to HTML.