The GLM Coding Plan is the cheapest big-name AI coding subscription on the market: Z.ai's Lite tier lists at $18 a month, undercutting Claude Pro and ChatGPT Plus at $20. But the number that actually decides whether it's worth it isn't the price — it's the quota multiplier. GLM-5.2, the flagship model you'd subscribe for, burns your prompt allowance up to 3x faster than the baseline model.
That distinction is where most coverage of this plan falls apart. Cheap-plan write-ups repeat Z.ai's tier table verbatim; skeptical takes wave at "Chinese model risk" without doing any math. Neither tells you what a dollar of GLM subscription actually buys relative to a dollar of Claude or ChatGPT — or admits that the three vendors meter usage in units that genuinely don't convert into each other.
This analysis does the arithmetic from primary sources: Z.ai's own pricing and quota documentation, Anthropic's published plan pages, and OpenAI's Codex pricing docs, all checked on July 3, 2026. It covers the tier mechanics, the 3x/2x GLM-5.2 multiplier, a cross-vendor value table, the vendor's own "15–30x" claim recomputed in dollars, the honest benchmark picture — and a section most plan reviews skip entirely: who should not subscribe.
- 01Lite lists at $18 — the cheapest big-name coding plan.It undercuts Claude Pro and ChatGPT Plus ($20) on sticker price, and yearly billing drops it to an effective $12.60/month. The trade: far narrower model access than either rival's subscription.
- 02GLM-5.2 drains quota 3x at peak, 2x off-peak — a usage multiplier, not a price hike.The subscription price never changes. What changes is how fast the flagship model consumes your prompt allowance versus GLM-4.7. Routing routine work to GLM-4.7 stretches every tier substantially.
- 03Quota windows hard-stop — there is no overflow billing.When a 5-hour window is exhausted, calls fail until it resets. Z.ai states the plan never drains your account balance — structurally different from metered usage-credit systems that keep charging past a soft cap.
- 04Z.ai's own math values the plan at 15–30x the fee in API-equivalent terms.By the vendor's stated methodology, Lite's quota is worth roughly $270–540/month at API rates, Pro $1,080–2,160, and Max $2,400–4,800. Vendor-stated and not independently audited — but checkable arithmetic.
- 05Skip it for long-horizon agents, deep Claude workflows, and residency-bound teams.Aggregator-tracked long-horizon evals put GLM-5.2 at roughly half of Opus 4.8's scores (SWE-Marathon 13.0 vs 26.0), deep Claude Code ecosystem users lose the model their harness was tuned for, and prompts route through China-based infrastructure.
01 — The TiersWhat $18, $72, and $160 actually buy.
The GLM Coding Plan sells in three individual tiers, priced at list as Lite $18, Pro $72, and Max $160 per month. Z.ai describes Pro as "Everything in Lite, plus 5x Lite usage" and Max as "Everything in Pro, plus 20x Lite usage" — same model access, same features, more quota. Every tier includes support for 20+ coding tools, with Claude Code, Cline, and OpenCode among those Z.ai names directly.
Two pricing wrinkles are worth knowing before the tier cards. First, billing frequency is a standing discount lever: Z.ai's subscribe page shows monthly, quarterly, and yearly toggles at −10%, −20%, and −30% off list respectively, which makes yearly Lite an effective $12.60 a month. Second, you may see $16.20 / $64.80 / $144 displayed on Z.ai's ZCode site — that is discounted display pricing (the 10%-off figure), not a different plan; this analysis quotes list throughout.
Each tier also bundles a monthly allowance of MCP tool calls shared across Z.ai's Web Search, Web Reader, and Zread MCPs — 100 on Lite, 1,000 on Pro, 4,000 on Max — while the Vision MCP draws from the same 5-hour prompt pool as normal usage. For scale: Web Search costs $0.01 per use on Z.ai's pay-as-you-go API price list, so the bundled search allowance is a small sweetener, not a headline.
Lite
The tier that undercuts every $20 rival. Rolling 5-hour windows plus a reported ~400-prompt weekly cap, 100 bundled MCP calls a month, and the full 20+ coding-tool support. The honest starting point for a one-month test.
Pro
Roughly ~400 prompts per 5-hour window and ~2,000 a week (reported), plus 1,000 monthly MCP calls. The realistic tier for a full-time developer driving GLM-5.2 daily once the multiplier is factored in.
Max
Reported ~1,600 prompts per 5-hour window and ~8,000 a week, with 4,000 MCP calls. Sized for parallel agent sessions and teams sharing heavy throughput — still cheaper at list than one Claude Max 20x seat.
The per-tier prompt counts deserve a hedge that Z.ai itself applies: the documentation frames them with "up to" language, as approximate ceilings rather than guarantees, and each "prompt" is estimated to fan out into 15–20 model calls as agentic sessions invoke tools and re-plan. We first covered the plan's structure in our GLM-5.2 launch coverage — what follows is the value math that launch-day reporting couldn't yet do.
02 — Quota MechanicsThe 3x multiplier is the real price tag.
Usage is governed by two simultaneous caps: a rolling 5-hour window and a weekly ceiling that resets on a 7-day cycle from your purchase date. But the mechanic that actually determines value is the model multiplier. Per Z.ai's own devpack FAQ, GLM-5.2 and GLM-5-Turbo consume plan quota at 3x the standard rate during peak hours (14:00–18:00 UTC+8) and 2x during off-peak hours. To be precise about what that means: it is a quota-consumption multiplier — how fast your prompt allowance depletes — not a price multiplier. Your subscription cost never changes.
Z.ai's own model guidance makes the practical strategy explicit. The FAQ pitches GLM-4.7 as comparable to Claude's Sonnet-level model and sufficient for daily development, with GLM-5.2 positioned as its Opus-level counterpart for complex reasoning and large-scale engineering — Z.ai's framing, not an independent equivalence claim. Route routine work to GLM-4.7 at 1x and reserve GLM-5.2 for the hard problems, and every tier stretches dramatically further. Run GLM-5.2 for everything at peak, and your effective allowance is a third of the sticker numbers.
What one Lite 5-hour window actually buys, by model
Derived by Digital Applied from Z.ai's stated 3x/2x multipliers and the reported ~80-prompt Lite window — approximate ceilings, not guaranteesThird-party pricing trackers additionally report a time-limited promo easing the off-peak multiplier from 2x to 1x through the end of September 2026. We could not find that clause on Z.ai's primary devpack pages when we fetched them directly on July 3, so treat it as reported rather than confirmed — if it applies to your hours, it roughly doubles off-peak GLM-5.2 throughput while it lasts.
The other mechanic that shapes value is what happens at the ceiling: nothing. There is no overflow billing, no automatic fallback to your API balance — calls simply fail until the window resets. Depending on your workflow that's either a hard limitation or the feature that makes the plan's cost genuinely fixed.
"Once the quota is used up, you'll need to wait until the next 5-hour cycle for it to refresh. The system will not deduct from your account balance."— Z.ai devpack FAQ, retrieved July 3, 2026
03 — Cross-Vendor MathGLM vs Claude vs ChatGPT — in each vendor's own units.
Here is the comparison most coverage fudges. The three subscription families meter usage in vocabularies that do not convert into each other: Z.ai counts "prompts" per rolling 5-hour window (each fanning into an estimated 15–20 model calls), OpenAI counts Codex "messages" per 5-hour window shared between local and cloud tasks, and Anthropic uses session-based allowances with two separate weekly caps — one across all models, one specifically for Sonnet-family usage. Any table that pretends these are one unit is fiction. So this one doesn't: each row states usage exactly as its vendor does.
| Plan | List / mo | Annual billing / mo | Usage, as the vendor states it | At the ceiling |
|---|---|---|---|---|
| GLM Coding Plan (Z.ai) | ||||
| Lite | $18 | $12.60 (−30% yearly) | ~80 prompts / 5h + ~400 / week (reported, approximate) | Hard stop until reset — no overflow billing |
| Pro | $72 | $50.40 (−30% yearly) | 5x Lite — ~400 prompts / 5h + ~2,000 / week (reported) | Hard stop until reset — no overflow billing |
| Max | $160 | $112 (−30% yearly) | 20x Lite — ~1,600 prompts / 5h + ~8,000 / week (reported) | Hard stop until reset — no overflow billing |
| Claude (Anthropic) | ||||
| Pro | $20 | $17 (annual, $200 up front) | Session-based allowance; two separate weekly caps (all-model + Sonnet-family) | Wait for reset; Fable 5 moves to metered usage-credits from July 8, 2026 |
| Max 5x | $100 | $100 (no annual toggle listed) | 5x Pro usage per session | Wait for reset; higher output limits, peak-traffic priority |
| Max 20x | $200 | $200 (no annual toggle listed) | 20x Pro usage per session | Wait for reset; higher output limits, peak-traffic priority |
| ChatGPT / Codex (OpenAI) | ||||
| Plus | $20 | — (none published) | 15–80 Codex messages / 5h (GPT-5.5) | Wait for reset; limits shared between local and cloud tasks |
| Pro 5x | $100 | — (none published) | 75–400 messages / 5h | Wait for reset; limits shared between local and cloud tasks |
| Pro 20x | $200 | — (none published) | 300–1,600 messages / 5h | Wait for reset; limits shared between local and cloud tasks |
Three reading notes. First, Claude Pro's $20 buys far more than a coding quota — Claude Code plus the Cowork, Design, and Science apps, Research access, and multi-model access per Anthropic's pricing page. GLM's $18 buys coding-tool quota on one model family. Cheaper, yes; equivalent, no. Second, OpenAI restructured ChatGPT Pro in April 2026 — TechCrunch reported the split of the old flat $200 tier into today's $100 Pro 5x and $200 Pro 20x, and OpenAI's Codex pricing page confirms Codex is bundled into every paid tier rather than sold separately. Anything you read about a single "$200 ChatGPT Pro" is now half the picture.
Third — and structurally most interesting — the ceilings behave differently. GLM's hard stop makes spend perfectly predictable and productivity interruptible. Anthropic is moving its top model in the opposite direction: Fable 5 shifts to metered usage-credit billing from July 8, 2026, which trades the hard stop for pay-as-you-exceed continuity. We've broken down what happens to Claude pricing after the July 7 grace period ends and, more broadly, how subscription plans and metered usage-credits compare across vendors — the GLM plan sits at the fixed-cost extreme of that spectrum.
04 — Vendor's Own MathThe "15–30x" claim, recomputed in dollars.
Z.ai's documentation makes a specific, checkable value claim: "The monthly available quota is converted based on API pricing, equivalent to approximately 15–30× the monthly subscription fee (weekly caps already factored in)." That is the vendor's own methodology — not an independent audit — but nobody else seems to run the numbers on it, so here they are, computed as 15x and 30x of each tier's list price.
Claimed API-equivalent value
Z.ai's stated conversion applied to the $18 list price. At GLM-5.2's published $1.40-input / $4.40-output per-Mtok rates, that range represents a serious volume of tokens for a plan costing less than one lunch a week.
Claimed API-equivalent value
The same conversion at Pro's $72 list. Even at the conservative 15x end, the claimed equivalent exceeds what most individual developers would ever spend on raw API calls in a month.
Claimed API-equivalent value
At Max's $160 list, Z.ai's own math claims up to $4,800 of API-equivalent usage. If your team genuinely consumes at that level, the subscription-vs-API arbitrage is the entire business case.
The same honesty cuts the other way on raw token rates. GLM-5.2's API list price is $1.40 per million input tokens and $4.40 per million output tokens (with cached input at $0.26). Against Anthropic's published rates, that makes GLM-5.2 roughly 3.6x cheaper on input and 5.7x cheaper on output than Claude Opus 4.8 ($5 / $25), and roughly 7.1x and 11.4x cheaper than Fable 5 ($10 / $50). Those are price ratios on per-token rates — explicitly not a claim that a GLM token does the same work as a Claude token. The benchmark section below is where that distinction gets quantified.
05 — Benchmark HonestyNear-frontier single-shot, half-frontier long-horizon.
GLM-5.2 — announced June 13, 2026, with MIT-licensed open weights following on June 16 — is the model the plan's value case rests on. The honest one-line summary: it is near-frontier on many single-shot coding benchmarks at a fraction of the cost, but trails Opus 4.8 on sustained long-horizon agent work. The chart below normalizes each benchmark to GLM-5.2's score as a share of Opus 4.8's, mixing Z.ai's vendor-stated numbers with independent aggregator tracking — each row is labeled with which is which.
GLM-5.2 as a share of Claude Opus 4.8's score · five coding benchmarks
Sources: Z.ai model documentation (vendor-stated rows) and independent benchmark aggregators, July 2026 · shares computed by Digital AppliedRead the shape, not the individual bars. On single-shot, well-scoped tasks — the FrontierSWE and Terminal-Bench profile — GLM-5.2 sits within a few points of Opus 4.8 while its tokens cost a fraction as much. On the evals that measure sustained autonomous work across a repository over hours, the gap roughly doubles: aggregators track GLM-5.2 at about 70% of Opus on NL2Repo and about half of Opus on SWE-Marathon. Note also that vendor-stated and independently-tracked numbers diverge on SWE-bench Pro — Z.ai's own 62.1 stands, but aggregators place Opus 4.8 meaningfully ahead at 69.2. For the full methodology and per-benchmark sourcing, see our benchmark comparison against Claude Opus.
This is why the quota multiplier and the benchmark gap have to be read together. The plan is cheapest precisely where the model is strongest — high-volume, single-shot, human-in-the-loop coding — and weakest exactly where premium Claude subscriptions earn their price: long-horizon agent runs you walk away from. That is not a flaw in either product; it is a segmentation the pricing already reflects.
06 — Decision MatrixWho the plan fits — five buyer profiles.
Independent reviews have started landing where our math does — HyScaler's June write-up, for one, concluded the plan is not recommended for casual users. The value case is concentrated in specific profiles. Find yours below; the three "skip" rows get the full argument in the next section.
Solo developer shipping daily, single-shot-heavy work
The strongest fit. High prompt volume, human-in-the-loop iteration, price sensitivity. Even at GLM-5.2's 3x peak draw, Lite's reported window beats paying per token — and GLM-4.7 at 1x covers most routine work.
Day-to-day feature work with a shared toolchain
Pro's 5x quota plus disciplined routing — GLM-4.7 for routine tasks, GLM-5.2 reserved for hard problems per Z.ai's own tiering guidance — makes the per-seat math hard to argue with for CRUD-heavy product work.
Long-horizon, multi-hour autonomous agent runs
The aggregator-tracked gap is roughly 2x on SWE-Marathon (13.0 vs 26.0) and wide on NL2Repo (48.9 vs 69.7). If your workflow is kicking off agents that run unattended for hours, the cheap quota buys reruns, not results.
Deep hooks, subagents, and skills investment
Routing Claude Code to GLM works mechanically, but your harness — hooks, subagents, skills, tuned conventions — was built against Anthropic's models. You'd be trading a tuned system for a cheaper engine it wasn't tuned for.
Data-residency or compliance constraints
Plan traffic routes through Z.ai's China-based infrastructure. If your clients, sector, or counsel require non-China data residency, the subscription is off the table regardless of price — self-hosted open weights are the only GLM path.
07 — The Honest PartWho should not subscribe — three disqualifiers.
This is the section most plan reviews skip, and it's where the actual decision gets made. Three profiles should keep their money — or their existing subscription.
1. Long-horizon agent workloads
If the work you'd route to this plan is sustained autonomous agent work — multi-hour refactors, repo-scale migrations, agents that plan, execute, and self-correct without a human between steps — the benchmark gap stops being academic. At roughly half of Opus 4.8's SWE-Marathon score in aggregator tracking, the failure mode isn't "slightly worse code"; it's runs that lose the thread partway and burn quota producing work you discard. Cheap tokens that need re-running aren't cheap.
2. Deep Claude-ecosystem investment
The plan's headline trick — pointing Claude Code at GLM models — works, and we document it in our setup guide for routing Claude Code through GLM-5.2. But if you've invested seriously in Claude Code's ecosystem — hooks, subagent orchestration, skills, CLAUDE.md conventions refined against Anthropic's models — swapping the model underneath changes the behavior every one of those layers was tuned for. Anthropic's premium buys the model that harness was built around; for power users, that integration is worth more than the $2-a-month sticker difference.
3. Data-residency and compliance constraints
One honest sentence covers it: subscribing routes your prompts and code through a China-based provider's infrastructure, which is subject to PRC data-governance law — and U.S. congressional scrutiny of Chinese-origin AI models has escalated through 2026, including a joint House investigation into PRC-developed AI models opened in April. For most indie developers this is a judgment call; for regulated industries and client work under confidentiality obligations, it's a blocker.
08 — Verdict & Getting StartedA one-month test, not an annual commitment.
The verdict, then. For high-volume, single-shot-heavy coding where quota per dollar is the binding constraint, the plan's arithmetic holds up even after the 3x multiplier — that's what the cross-vendor table and Z.ai's own 15–30x math both point to. For the three profiles above, it doesn't, at any price. And because subscriptions are non-refundable once purchased, the rational entry is the smallest one: monthly Lite, not the −30% yearly deal, until your own usage data says otherwise.
If the arithmetic matches your workload, the honest test costs $18: subscribe to the GLM Coding Plan Lite tier for a single month, run a real week of your actual work through it, and upgrade only if you hit window ceilings. Referral link: we earn Z.ai platform credits if you subscribe, and new Z.ai accounts get 10% off their first subscription order.
The 10% referral discount applies only to a new account's first order, can't be stacked with other first-order discounts, and requires completing payment within 72 hours of clicking the link. Z.ai publishes no end date for the offer.
Setup is a one-liner — Z.ai's subscribe page ships an npx @z_ai/coding-helper installer that loads the plan into your coding tool of choice, or you can wire Claude Code manually via environment variables using the exact setup steps in our guide. Worth knowing before you pick a surface: Z.ai also shipped ZCode, its desktop agentic development environment, the week of July 1, 2026 — and plan subscribers currently get roughly 1.5x effective GLM-5.2 quota inside ZCode under a separate, vendor-published promo that Z.ai says runs until July 31, 2026. Our ZCode guide covers what the environment does and doesn't change about the plan's economics.
And if you'd rather have this evaluation run against your actual repositories — plan-vs-plan, on your workload, with routing recommendations instead of vendor benchmarks — that comparative eval is exactly where our AI transformation engagements start.
09 — ConclusionCheap is a strategy, not a verdict.
The $18 question isn't the price — it's whether your workload fits the quota.
The GLM Coding Plan is genuinely the cheapest serious coding subscription available, and the discount is real rather than promotional: list prices of $18/$72/$160, a standing −30% for yearly billing, and a hard-stop quota design that makes the spend perfectly predictable. For high-volume, single-shot coding, the value per dollar is hard to beat — Z.ai's own math prices the quota at 15–30x the fee, and even discounted for vendor optimism, the arbitrage is wide.
The trend worth reading underneath the price tag: Z.ai is running a distribution-first strategy, selling quota below API-equivalent value to win a seat inside the coding harnesses developers already use. The 3x/2x multiplier is the throttle that makes that sustainable — and it's the lever to watch, because vendors adjust multipliers and promos far more quietly than they adjust prices. Meanwhile the market is splitting into two cap philosophies: hard-stop windows at the budget end, metered usage-credits at the frontier end, with Anthropic's July 8 Fable 5 shift the clearest marker of the latter.
Looking forward, expect the specifics here to move: the reported off-peak promo runs through September, the ZCode quota promo is published to end July 31, and every vendor in the table has re-priced at least once this year. Treat this page as a method, not a snapshot — recompute the three tables against live pricing pages before you commit, start monthly, and let one month of your own telemetry, not anyone's benchmark chart, make the annual decision.