Is GLM-5.2 cheaper than the frontier?

The numbers

What the data shows

Three numbers that answer the question. Source: first-party pricing and Artificial Analysis Intelligence Index v4.1.

vs GPT-5.5

6.0×

cheaper per request at medium reasoning effort

vs Claude Opus 4.8

5.1×

cheaper per request at medium reasoning effort

AA Intelligence Index v4.1

51

leading open-weights model, tied with Opus 4.8 and GPT-5.5

The claim being addressed

What people are saying

A widely-shared take from a respected voice in the space argues GLM-5.2 is more expensive than the proprietary frontier at medium reasoning effort. Worth examining.

I see a lot of people hyped about GLM-5.2. Rightfully so! Having an open weight model surpass GPT-5.4 and every Gemini model is dope.

That said - it's not cheap. Both Opus 4.8 and GPT-5.5 set to "medium" are cheaper and smarter than GLM-5.2 pic.twitter.com/SPovI1LKnZ
— Theo - t3.gg (@theo) June 21, 2026

Theo is right about the first half. GLM-5.2 does surpass GPT-5.4 and every Gemini model on the Intelligence Index. The second half, the cheaper and smarter part, breaks when you run the math.

The pricing data

The pricing math

Reasoning-effort parameters control how many tokens a model spends thinking. They do not change the per-token rate.

The reasoning_effort parameter on GPT-5.5, the extended-thinking budget on Claude, and reasoning_effort on GLM-5.2 all work the same way: they set how many tokens the model spends thinking, and those tokens are billed at the same per-token rate in every effort mode. A higher effort setting raises the bill only by raising the token count.

Per-token pricing published June 21, 2026. Anthropic cache write is billed separately at $6.25/MTok; OpenAI and Google cache write is free.
Model	Provider	Input /MTok	Cache hit /MTok	Output /MTok
GLM-5.2	Z.ai	$1.40	$0.26	$4.40
Gemini 3 Pro	Google	$2.00	$0.20	$12.00
Claude Opus 4.8	Anthropic	$5.00	$0.50	$25.00
GPT-5.5	OpenAI	$5.00	$0.50	$30.00

For Opus 4.8 at medium effort to come out cheaper than GLM-5.2 on output, Opus 4.8 would need to emit roughly 5.7× fewer output tokens than GLM-5.2, since that is the ratio of their output rates ($25.00 vs $4.40 per MTok). In practice, GLM-5.2 does use more reasoning tokens at medium effort than Opus 4.8, but the per-token rate gap is large enough that GLM-5.2 still comes out cheaper.
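The break-even figure follows directly from the output rates in the table. Here is a minimal sketch of that arithmetic; the function name `break_even_ratio` is illustrative, and input and cache costs are deliberately ignored, so this is a bound on output spend only.

```python
# Output rates in USD per million tokens, copied from the pricing table above.
OUTPUT_RATE = {"GLM-5.2": 4.40, "Claude Opus 4.8": 25.00, "GPT-5.5": 30.00}

def break_even_ratio(rival: str, baseline: str = "GLM-5.2") -> float:
    """How many times fewer output tokens `rival` must emit to match
    `baseline` on output cost alone (input and cache costs ignored)."""
    return OUTPUT_RATE[rival] / OUTPUT_RATE[baseline]

for model in ("Claude Opus 4.8", "GPT-5.5"):
    print(f"{model}: {break_even_ratio(model):.1f}x fewer output tokens needed")
# Claude Opus 4.8: 5.7x fewer output tokens needed
# GPT-5.5: 6.8x fewer output tokens needed
```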

The workload comparison

Cost at a real medium-reasoning workload

20K input tokens, 70% prompt-cache hit, 10K output including reasoning. Reasoning tokens are billed as output on every provider.

Cost per request

Sorted cheapest first. Bar length is cost. Color intensity shows where each model ranks.

Opus 4.8 → GLM-5.2

$0.29 → $0.06

5.1× more expensive on the same workload.

GPT-5.5 → GLM-5.2

$0.34 → $0.06

6.0× more expensive on the same workload.

Gemini 3 Pro → GLM-5.2

$0.13 → $0.06

2.4× more expensive on the same workload.
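The per-request figures above can be reproduced from the pricing table with a few lines of arithmetic. The sketch below is illustrative rather than the page's own tooling: the function name `request_cost` and its defaults are assumptions, cache-write charges are left out, and the output default of 10K tokens is the value at which the table rates yield the quoted $0.06, $0.13, $0.29, and $0.34.

```python
# USD per million tokens: (uncached input, cache hit, output), from the pricing table.
RATES = {
    "GLM-5.2": (1.40, 0.26, 4.40),
    "Gemini 3 Pro": (2.00, 0.20, 12.00),
    "Claude Opus 4.8": (5.00, 0.50, 25.00),
    "GPT-5.5": (5.00, 0.50, 30.00),
}

def request_cost(model, input_tokens=20_000, cache_hit=0.70, output_tokens=10_000):
    """Cost in USD of one request; reasoning tokens count as output tokens."""
    input_rate, cached_rate, output_rate = RATES[model]
    cached_tokens = input_tokens * cache_hit
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * input_rate
             + cached_tokens * cached_rate
             + output_tokens * output_rate)
    return total / 1_000_000

baseline = request_cost("GLM-5.2")
for model in sorted(RATES, key=request_cost):  # cheapest first
    cost = request_cost(model)
    print(f"{model:<16} ${cost:.2f}  {cost / baseline:.1f}x")
```

Changing `output_tokens` per model is the way to test the "GLM-5.2 thinks longer" objection: pass a larger value for GLM-5.2 and see how far it has to grow before the ordering flips.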

The intelligence comparison

Smarter? No. They’re tied.

Artificial Analysis Intelligence Index v4.1, a weighted composite of 9 evaluations including GDPval-AA v2, Terminal-Bench v2.1, HLE, GPQA Diamond, and AA-Omniscience.

AA Intelligence Index v4.1

The four frontier models land within two points of each other. Only Claude Fable 5, at $10 and $50 per MTok, clearly leads.

GLM-5.2 at 51 is the leading open-weights model on the index. On GDPval-AA v2 specifically, the real-world agentic work benchmark, GLM-5.2 scores 1524, ahead of GPT-5.5 (xhigh) at 1514. The Intelligence Index lead belongs to Claude Fable 5 (~60), which at $10 and $50 per MTok costs about twice as much per token as Opus 4.8 and more than eleven times GLM-5.2's output rate.

The trade-off Theo flagged

The catch: more output tokens

Theo followed up with a fair point. Cheaper per token does not mean cheaper in time.

It also uses way more output tokens. The tokens are cheaper, but the volume of them means you'll spend much more time waiting for results.

Still dope! Just trying to make sure people set their expectations properly pic.twitter.com/hy6NO0CtEq
— Theo - t3.gg (@theo) June 21, 2026

The point about waiting time is true, and worth being upfront about.

Output tokens per task

GLM-5.2 emits the most output tokens of any leading open-weights model. Lower per-token cost, but more tokens to wait for.

At max thinking effort, GLM-5.2 emits 43k output tokens per Intelligence Index task (37k reasoning, 6k answer). For comparison, MiniMax-M3 emits 24k, Kimi K2.6 emits 35k, and DeepSeek V4 Pro (max) emits 37k. That makes GLM-5.2 roughly 16% more verbose than DeepSeek V4 Pro (max), 23% more than Kimi K2.6, and 79% more than MiniMax-M3, the leanest of these peers.
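The relative verbosity can be checked from the per-task token counts quoted above. This is a small sketch of that arithmetic; the dictionary and function names are illustrative.

```python
# Output tokens per Intelligence Index task, in thousands, as quoted above.
OUTPUT_TOKENS_K = {
    "GLM-5.2": 43,
    "MiniMax-M3": 24,
    "Kimi K2.6": 35,
    "DeepSeek V4 Pro (max)": 37,
}

def extra_verbosity_pct(peer: str, model: str = "GLM-5.2") -> float:
    """Percent more output tokens `model` emits per task than `peer`."""
    return (OUTPUT_TOKENS_K[model] / OUTPUT_TOKENS_K[peer] - 1) * 100

for peer in ("MiniMax-M3", "Kimi K2.6", "DeepSeek V4 Pro (max)"):
    print(f"GLM-5.2 vs {peer}: {extra_verbosity_pct(peer):.0f}% more output tokens")
# GLM-5.2 vs MiniMax-M3: 79% more output tokens
# GLM-5.2 vs Kimi K2.6: 23% more output tokens
# GLM-5.2 vs DeepSeek V4 Pro (max): 16% more output tokens
```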

For batch jobs and overnight pipelines, this is irrelevant. The cost advantage dominates. For latency-sensitive interactive use, Opus 4.8 or GPT-5.5 medium will feel snappier, even at higher cost, because they finish thinking in fewer tokens. That is a real trade-off, not a footnote.

The honest answer

Where the claim could be technically true

Three narrow cases where “Opus 4.8 medium beats GLM-5.2 medium on cost” can hold up, if you squint.

Narrow case 1

Batch API

Anthropic and OpenAI offer 50% discounts on batch jobs. Z.ai has no published batch tier. Against full-price GLM-5.2, the gap shrinks to roughly 2.5× for Opus 4.8 and 3× for GPT-5.5, down from 5.1× and 6.0×.
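The batch-tier effect is a one-line adjustment to the workload multiples. This sketch assumes GLM-5.2 is billed at full price (no batch tier) and that the 50% discount applies uniformly to the whole request; the names are illustrative.

```python
BATCH_DISCOUNT = 0.50  # batch discount stated for Anthropic and OpenAI

# Full-price cost multiples over GLM-5.2 from the workload comparison.
FULL_PRICE_MULTIPLE = {"Claude Opus 4.8": 5.1, "GPT-5.5": 6.0}

def batch_multiple(model: str) -> float:
    """Cost multiple over full-price GLM-5.2 once the batch discount applies."""
    return FULL_PRICE_MULTIPLE[model] * (1 - BATCH_DISCOUNT)

for model in FULL_PRICE_MULTIPLE:
    print(f"{model}: {batch_multiple(model):.2f}x GLM-5.2 on batch")
# Claude Opus 4.8: 2.55x GLM-5.2 on batch
# GPT-5.5: 3.00x GLM-5.2 on batch
```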

Narrow case 2

Subscription vs API

Claude Max or ChatGPT Pro at $200/mo bundles effectively unlimited usage of the top model. That pricing is not comparable to per-token API rates.

Narrow case 3

Single benchmarks

On individual evals (HLE, certain coding tasks), Opus 4.8 medium or GPT-5.5 medium can outperform GLM-5.2. The composite index places them as tied.

The bottom line

The hype is right

For the first time, an open-weights model is a head-on competitor to the proprietary frontier. That matters.

GLM-5.2 is cheaper than Opus 4.8 and GPT-5.5 at every published reasoning-effort level. It is roughly tied with both on the Artificial Analysis Intelligence Index v4.1. It comes from Z.ai under an MIT license, which means self-hosting, fine-tuning, and zero vendor lock-in are real options. None of the proprietary frontier models offer that.

For the first time, an open-weights model is a head-on competitor to the frontier on cost, intelligence, and availability. That is the right thing for builders who don’t want to depend on a single provider. The hype is correct. Set expectations on latency, then ship.

References

Sources

Every claim on this page ties back to a source listed below: first-party pricing pages, Artificial Analysis, and the original posts. No invented numbers.

Theo (X): “Is GLM-5.2 actually cheaper than Opus 4.8 and GPT-5.5?”
Theo (X): “It also uses way more output tokens...”
Artificial Analysis: GLM-5.2 is the new leading open weights model
Anthropic: API pricing (Opus 4.8, Sonnet 4.6, Haiku 4.5)
OpenAI: API pricing (GPT-5.5, GPT-5.4, GPT-5.4 mini)
Google Cloud: Vertex AI (Gemini 3 Pro pricing)
Z.ai: GLM-5.2 quick start

Last updated June 21, 2026. Pricing pulled from first-party pricing pages on this date. Intelligence Index v4.1 published June 15, 2026.