The choice between AI subscriptions and usage credits stopped being academic the week of July 1, 2026. Anthropic announced that Claude Fable 5 would move to metered usage credits after July 7, while OpenAI — in the same window — publicly defended keeping Codex included on every ChatGPT tier with overage credits left optional. Two labs, opposite bets, one question landing on every budget owner's desk: do you pay a flat fee, or pay by the token?
What is at stake is not which model is smartest — it is which billing shape you can plan around. A subscription is a fixed line item your finance team already understands. Usage credits are a variable, consumption-driven cost that behaves more like a cloud bill: cheap when idle, alarming when an agent loop runs hot. Get the match wrong and you either leave capability on the table or discover a five-figure overage after the fact.
This guide takes the two stories the press covered separately, Anthropic's metering move and OpenAI's model preview, and reads them as one strategic split. Then it hands you the part nobody else published: a tier-by-tier included-vs-metered matrix across OpenAI, Anthropic, and Google, and a per-workload framework for deciding which meter to standardize on. Every figure below is dated and sourced; where a number is vendor-stated or secondary-sourced, it is marked as such.
- 01. The same week produced two opposite bets. Anthropic moved Claude Fable 5 to usage credits after July 7, 2026; OpenAI kept Codex included on every ChatGPT tier with overage credits explicitly optional. The split is about billing shape, not model quality.
- 02. Anthropic's meter has a hard cutoff. Fable 5 is included on Pro, Max, and Team for up to 50% of each plan's weekly limits through July 7; after that, an org that has not enabled usage credits loses Fable 5 access entirely. Standard Enterprise seats get no grace window at all.
- 03. OpenAI's overage credits are opt-in. Per OpenAI's help center, Codex is included across Free, Go, Plus, Pro, Business, Edu, and Enterprise. Buying credit packs is only needed to unblock users past their limits; otherwise you simply wait for the limit to reset.
- 04. The 'tokenmaxxing' era is visibly ending. CNBC reports enterprises shifting from maximizing token use to controlling it: Uber introduced AI-tool spend tiers starting at a $1,500/employee/month base, and AI startup Lindy moved all of its model traffic off Claude to a cheaper model to survive its cost curve.
- 05. Match the billing shape to the workload. Steady, human-in-the-loop use favors a flat subscription. Spiky, high-volume, or agent-driven loops favor metered credits with hard spend controls. The expensive mistake is standardizing on one shape for every workload.
01 — The Split: The week AI pricing split in two.
Most coverage treated these as two unrelated stories: an Anthropic pricing change and an OpenAI model preview. Read side by side, they are the same argument about how AI should be billed in the second half of 2026. Anthropic is pushing its most capable coding model behind a meter. OpenAI is holding the line that its coding agent stays inside the subscription, with credits as an optional pressure-release valve — not a toll gate.
Neither position is charity. Anthropic's meter follows a clear commercial logic (and, as we cover below, a pattern that started with its enterprise seats). OpenAI's included stance is a land-grab: keep the subscription simple, keep adoption frictionless, and let the heaviest users buy their way past the caps. The two cards below frame the bet each lab is making.
Codex stays in the subscription
Per OpenAI's help center, Codex is included across every ChatGPT plan. Credit packs exist only to unblock users who exceed their limits; otherwise you wait for the reset. The message: keep subscription prices low for most, let heavy users go wild.
Fable 5 moves to usage credits
Fable 5 is included on Pro, Max, and Team for up to 50% of weekly limits until July 7, then access requires enabling usage credits. Standard Enterprise seats are metered from day one. Direct API rate: $10 / $50 per million input / output tokens.
02 — Anthropic's Move: Anthropic moves Fable 5 to metered credits.
Anthropic redeployed Fable 5 globally on July 1, 2026, and attached a billing change to the return. Through July 7, Fable 5 is included on Pro, Max, and Team plans for up to 50% of each plan's weekly usage limits. After that date, Fable 5 access requires enabling usage credits — and if credits are not enabled, Fable 5 stops working entirely for that organization. Premium Enterprise seats get the same grace window; standard Enterprise seats get none, with all Fable 5 usage billed through credits from day one.
Anthropic has not published a credit-to-tasks conversion, so the only concrete number available is the direct API rate: $10 per million input tokens and $50 per million output tokens (as surfaced in community usage-credit guides on July 1). Treat any specific dollars-per-task claim you see elsewhere as invented until Anthropic publishes one.
- $10 per million input tokens (Fable 5 per-token list price). Plus $50 per million output tokens. This is the only hard figure Anthropic has attached to Fable 5 usage; there is no published credit-to-task conversion as of July 1, 2026.
- 50% of weekly limits, through Jul 7. Pro, Max, and Team plans get Fable 5 included for up to half of each plan's weekly usage limits until July 7. After that, an org without usage credits enabled loses Fable 5 access.
- Zero included Fable 5 allowance (standard Enterprise). Standard Enterprise seats get no grace window: every Fable 5 request is billed through usage credits immediately. Premium Enterprise seats keep the same July 7 grace as Pro/Max/Team.
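The access rules in this section reduce to a small decision function. The sketch below is an illustration, not Anthropic's implementation: the plan labels are strings chosen here, `weekly_share_used` is a hypothetical input (Anthropic has not published how the 50% share is measured), July 7 is treated as the last included day, and usage past the 50% share during the grace window is assumed to fall through to credits.

```python
from datetime import date

GRACE_END = date(2026, 7, 7)  # last day of the included window (assumed inclusive)
GRACE_PLANS = {"pro", "max", "team", "enterprise_premium"}  # illustrative labels
METERED_FROM_DAY_ONE = {"enterprise_standard"}


def fable5_access(plan: str, today: date, credits_enabled: bool,
                  weekly_share_used: float = 0.0) -> str:
    """Return 'included', 'metered', or 'blocked' for a Fable 5 request."""
    if plan not in GRACE_PLANS | METERED_FROM_DAY_ONE:
        return "blocked"  # e.g. the free tier, which has no Fable 5
    in_grace = plan in GRACE_PLANS and today <= GRACE_END
    if in_grace and weekly_share_used < 0.5:
        return "included"
    # Outside the grace window (or past the assumed 50% share),
    # access depends entirely on whether usage credits are enabled.
    return "metered" if credits_enabled else "blocked"
```

Running it for a Pro org on July 8 with credits disabled returns `"blocked"`, which is the hard cutoff a budget owner needs to plan around.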
This is not a one-off. It is at least the second billing move Anthropic has aimed at programmatic and agentic usage inside roughly three weeks. On June 15, Anthropic proposed a separate Agent SDK credit — sized to each plan's fee, billed at API rates, per-user and non-pooled — before shelving it ahead of the effective date. We covered that reversal in detail in Anthropic's shelved June 15 Agent SDK credit split. The through-line is unmistakable: Anthropic keeps testing where it can put a meter on automated consumption.
The enterprise side moved first. Per The Register, Anthropic began renewing enterprise contracts onto a usage-based plan from late 2025; by early 2026 the standard became a single seat around $20 per employee per month with the old bundled-token discounts removed. As IntuitionLabs CEO Adrien Laurent told The Register, for some of his firm's clients the base seat was already only about 20 percent of the total bill, with the other 80 percent metered API usage — so for heavy users, the meter was already the reality. Metering Fable 5 for consumers simply extends a shape enterprises had been living with for months.
03 — OpenAI's Bet: OpenAI keeps Codex included, on purpose.
OpenAI's countermove is documentation, not a launch. Its help center states plainly that Codex is included across every ChatGPT plan — Free, Go, Plus, Pro, Business, Edu, and Enterprise — and that credit packs are optional, needed only when a workspace owner wants to unblock users who exceed their limits. Everyone else waits for the limit to reset. On April 2, 2026, OpenAI did align Codex pricing to API-token usage rather than a flat per-message model for Plus, Pro, Business, and new Enterprise plans — but the subscription itself still carries the included allowance.
The philosophy is stated most directly by Sam Altman, describing why overage credits exist at all. It is a deliberate design choice: keep the base subscription cheap and predictable for the majority, and let a small set of power users pay to go past the caps.
“If you want to use more Codex after you hit your subscription limits, you can now buy credits as needed. This is something we expect to do for compute-intensive features; it will let us keep subscription prices low for most users and let the rest of you go wild.”
Sam Altman, CEO, OpenAI (on X)
The nuance that matters for a budget owner is opt-in versus mandatory. Under OpenAI's model, the default state is included; a user who does nothing simply hits a ceiling and resumes after a reset. Under Anthropic's post-July-7 model for Fable 5, the default state for continued access is metered; an org that does nothing loses the model. Same word — credits — opposite defaults.
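The opposite defaults can be written as one function with a single policy flag. This is a minimal sketch under stated assumptions: `Outcome` and `outcome_at_limit` are names invented for illustration, and neither vendor exposes such an API.

```python
from enum import Enum


class Outcome(Enum):
    WAIT_FOR_RESET = "paused until the limit resets"
    BILLED_TO_CREDITS = "continues, billed to credits"
    NO_ACCESS = "model unavailable"


def outcome_at_limit(credits_required: bool, credits_enabled: bool) -> Outcome:
    """What happens once included usage is exhausted or absent."""
    if credits_enabled:
        return Outcome.BILLED_TO_CREDITS
    # With credits off, the policy flag decides: a ceiling or a lockout.
    return Outcome.NO_ACCESS if credits_required else Outcome.WAIT_FOR_RESET


# An org that changes nothing, under each policy as described in the text:
opt_in_default = outcome_at_limit(credits_required=False, credits_enabled=False)
required_default = outcome_at_limit(credits_required=True, credits_enabled=False)
```

Here `opt_in_default` is `WAIT_FOR_RESET` and `required_default` is `NO_ACCESS`. The only difference between the two calls is the policy flag, which is the whole point of the opt-in versus mandatory distinction.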
OpenAI's own frontier story sits behind a different wall entirely. It previewed GPT-5.6 (Sol, Terra, and Luna) on June 26 as a government-gated limited preview — available through the API and Codex only, to roughly 20 government-approved companies, and not in the ChatGPT consumer app at preview stage. OpenAI has said it expects to widen access and reach a broad release “in the coming weeks,” but has not committed to a date, so treat any specific general-availability date you see as speculation. For the preview mechanics, see GPT-5.6's government-gated preview. In the meantime, the model still doing the daily work in most stacks is GPT-5.5, which stays broadly available while Sol remains gated.
04 — Efficiency Era: The tokenmaxxing era hits its ceiling.
Both pricing bets land into the same macro shift. CNBC reported on June 26 that the industry is moving from “tokenmaxxing” — maximizing token consumption to squeeze out capability — toward efficiency, as enterprise buyers start rationing spend. The evidence is concrete. Per CNBC, Uber implemented AI-tool spending tiers this month starting at a $1,500-per-employee-per-month base (with an option to request more), after its CTO told The Information the company had blown through its entire annual AI budget in just four months. AI startup Lindy moved 100% of its model traffic off Claude to a cheaper model to control costs, with CEO Flo Crivello framing it as a matter of survival for the business.
- $1,500 per employee, per month base. Uber introduced AI-tool spend tiers this month at a $1,500/employee base, per CNBC, after its CTO told The Information the company exhausted its annual AI budget in four months. (Source: Uber and The Information, via CNBC.)
- 100% of model traffic moved off Claude. Lindy, a ~25-person startup, shifted all model traffic to a cheaper model to control its cost curve; CEO Flo Crivello expects it to save millions within months, though he still expects AI spend to exceed payroll.
- ~$47 billion annualized, ~May 2026. Anthropic's run rate, up from roughly $9–10 billion in full-year 2025 revenue, per CNBC; Q2 2026 was Anthropic's first profitable quarter. OpenAI's run rate paced closer to ~$25B over the same period, per The Information via CNBC.
Here is the interpretation the raw numbers imply. When a customer's largest single variable cost is tokens, and that cost can quadruple in a quarter, the customer stops caring which model is marginally smartest and starts caring which billing shape it can forecast. That reframes the whole vendor relationship. Anthropic's run-rate surge to roughly $47 billion annualized (from a ~$9–10 billion 2025 base, per CNBC) is exactly the kind of growth that cannot continue if buyers start capping spend — which is the analyst read on why both labs are racing to go public.
05 — Pricing Matrix: Included vs metered, tier by tier.
No single published source lines up all three major vendors' tier-by-tier billing philosophy with the opt-in nuance attached. This is that view, as of July 1, 2026. Read the Overage column as the tell: “optional” means you can ignore it, “required” means the model stops without it, and “not offered” means access is bundled with no à la carte path.
| Tier | Base fee | Frontier access included | Overage | Direction (2026) |
|---|---|---|---|---|
| OpenAI — ChatGPT | ||||
| Free | $0 | Codex included (tight limits) | Credits optional | More included |
| Go | $8 / mo | Codex included | Credits optional | More included |
| Plus | $20 / mo | Codex included | Credits optional | More included |
| Pro | $100 / $200 mo | Codex included (higher caps) | Credits optional | More included |
| Business | ~$20–25 / seat* | Codex included | Credits optional | More included |
| Enterprise | Custom | Codex included | Credits + spend controls | More included |
| Anthropic — Claude | ||||
| Free | $0 | No Fable 5 | Not offered | More metered |
| Pro | $20 / mo | Fable 5 incl. to Jul 7 | Credits required after | More metered |
| Max 5× | $100 / mo | Fable 5 incl. to Jul 7 | Credits required after | More metered |
| Max 20× | $200 / mo | Fable 5 incl. to Jul 7 | Credits required after | More metered |
| Team | Per seat | Fable 5 incl. to Jul 7 | Credits required after | More metered |
| Enterprise (standard) | ~$20 / employee mo | No included Fable 5 | Credits from day one | More metered |
| Enterprise (premium) | Custom | Fable 5 incl. to Jul 7 | Credits after grace | More metered |
| Google — Gemini* | ||||
| AI Plus | ~$7.99 / mo* | Bundled (compute-metered) | No à la carte overage | Bundled, compute-metered |
| AI Pro | ~$19.99 / mo* | Bundled (compute-metered) | No à la carte overage | Bundled, compute-metered |
| AI Ultra | ~$99.99 / mo* (cut from ~$249.99) | Bundled (compute-metered) | No à la carte overage | Bundled, compute-metered |
Read the asterisks honestly. ChatGPT Business's seat price and its reported April 2026 cut come from pricing trackers rather than a primary confirmation in our research pass, and every Google Gemini figure here is aggregator-sourced (9to5Google, Finout) rather than confirmed against Google's own page this week — including the AI Ultra entry price reportedly cut from around $249.99 to about $99.99 per month. The direction of travel, though, is well corroborated across sources: OpenAI leaning included, Anthropic leaning metered, and Google keeping access bundled while quietly debiting a “compute-used” budget behind the subscription.
06 — Rate Card: What a million tokens actually costs.
Once you are metered, the meter's shape matters. The table below takes the published per-token list rates and adds a single computed column: a blended cost per million tokens at a 3:1 input-to-output mix, defined as (3 × input + 1 × output) ÷ 4. That mix is a deliberately conservative stand-in for chat and coding workloads, where you read far more context than you generate. This is a billing-math illustration, not a capability comparison — Fable 5 is a flagship and GPT-5.6 is a government-gated preview, so read the spread as the shape of each meter, not as model parity.
| Model | Input / 1M | Output / 1M | Blended / 1M (3:1) |
|---|---|---|---|
| Anthropic — flagship, direct API | |||
| Claude Fable 5 | $10.00 | $50.00 | $20.00 |
| OpenAI GPT-5.6 — government-gated preview (API + Codex only) | |||
| GPT-5.6 Sol | $5.00 | $30.00 | $11.25 |
| GPT-5.6 Terra | $2.50 | $15.00 | $5.63 |
| GPT-5.6 Luna | $1.00 | $6.00 | $2.25 |
The arithmetic is worth internalizing because it explains the behavior. Output tokens dominate the blended cost: at a 3:1 mix, Fable 5's single share of output at $50 outweighs its three shares of input at $10 each ($50 versus $30), which pulls the blended figure to $20.00, double the input rate. That is why an agent that generates long completions, reasoning traces, or repeated tool calls burns credits far faster than its input volume suggests. The GPT-5.6 preview tiers land at $11.25 (Sol), $5.63 (Terra), and $2.25 (Luna) on the same formula; these are preview list rates, sourced to Axios and MarkTechPost, and gated behind government-approved access. When you budget a metered workload, budget the output side first.
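The blended column can be reproduced in a few lines. The sketch below uses the list rates from the table and the article's (3 × input + 1 × output) ÷ 4 formula; the function names are illustrative. It works in `Decimal` with half-up rounding because Terra's raw value is exactly 5.625, and default float formatting would round that to 5.62 rather than the table's 5.63.

```python
from decimal import Decimal, ROUND_HALF_UP

# (input, output) list rates in USD per million tokens, from the rate card.
RATES = {
    "Claude Fable 5": ("10.00", "50.00"),
    "GPT-5.6 Sol": ("5.00", "30.00"),
    "GPT-5.6 Terra": ("2.50", "15.00"),
    "GPT-5.6 Luna": ("1.00", "6.00"),
}


def blended_per_million(input_rate: str, output_rate: str,
                        input_parts: int = 3, output_parts: int = 1) -> Decimal:
    """Weighted average rate for a given input:output token mix."""
    inp, out = Decimal(input_rate), Decimal(output_rate)
    raw = (input_parts * inp + output_parts * out) / (input_parts + output_parts)
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def output_share(input_rate: str, output_rate: str,
                 input_parts: int = 3, output_parts: int = 1) -> Decimal:
    """Fraction of the blended cost that comes from output tokens."""
    inp, out = Decimal(input_rate), Decimal(output_rate)
    weighted_out = output_parts * out
    return weighted_out / (input_parts * inp + weighted_out)


for model, (inp, out) in RATES.items():
    print(f"{model}: ${blended_per_million(inp, out)} blended, "
          f"{output_share(inp, out):.1%} of cost from output")
```

Changing `input_parts` and `output_parts` shows how quickly the picture shifts: at a 1:1 mix, Fable 5's blended rate rises to $30.00, which is why output-heavy agent loops deserve their own budget line.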
07 — Decision Framework: Which billing shape fits which workload.
The vendor is the wrong first question. The right first question is how your token spend behaves — how predictable it is, and how much of it a machine drives without a human in the loop. Map the workload to a billing posture first; the vendor choice falls out of that. Five archetypes cover most of what agencies and product teams actually run.
Steady, human-paced usage
A person types, reads, and thinks between prompts, so consumption is naturally bounded. A flat subscription is almost always cheaper and simpler than a meter here — the included allowance rarely runs dry. This is the case where OpenAI's included stance and Anthropic's pre-July-7 window both work fine.
IDE assistance, one developer
Bursty but still human-gated. Stay subscription-first and watch the weekly limits; only enable credits if a specific sprint genuinely needs to burst past caps. On Anthropic, note the July 7 cutoff — decide before then whether Fable 5 is worth enabling credits for.
CI automation & headless runs
Machine-driven, high-output, runs without a person watching the meter. This is exactly the workload Anthropic's shelved June 15 Agent SDK credit targeted — audit it before Anthropic revisits the idea. Budget the output side, set hard spend controls, and treat credits as a real line item.
High-volume, variable traffic
Spend scales with demand you don't fully control, so a flat subscription either caps you or overpays. A metered-committed posture with per-workspace spend limits and model routing (cheaper models for easy turns) matches the cost to the revenue it drives.
Multi-team, mixed workloads
You will run several of the above at once, so standardize on the tooling, not a single shape: org-level spend controls, per-team budgets, and a fallback model. Anthropic's enterprise seats are already metered; plan for the meter and instrument it before scale, not after the first surprise bill.
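The five archetypes can be approximated with three inputs: whether a human gates each request, whether spend is predictable, and how many teams share the budget. The sketch below is a simplification with hypothetical names (`Workload`, `billing_posture`). In particular, it treats scheduled CI runs as machine-driven but predictable, which is an assumption made here and not a claim from the framework itself.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    human_in_loop: bool      # a person gates each request
    spend_predictable: bool  # token spend is roughly stable month to month
    teams: int = 1           # number of teams sharing the budget


def billing_posture(w: Workload) -> str:
    """Map a workload to one of the five postures described above."""
    if w.teams > 1:
        return "mixed: org-level spend controls, per-team budgets, fallback model"
    if w.human_in_loop and w.spend_predictable:
        return "flat subscription"
    if w.human_in_loop:
        return "subscription-first, credits only for bursts"
    if w.spend_predictable:
        return "metered credits with hard spend controls"
    return "metered-committed with per-workspace limits and model routing"


examples = [
    Workload("chat assistant", human_in_loop=True, spend_predictable=True),
    Workload("IDE assistance", human_in_loop=True, spend_predictable=False),
    Workload("CI automation", human_in_loop=False, spend_predictable=True),
    Workload("customer-facing agent", human_in_loop=False, spend_predictable=False),
    Workload("platform org", human_in_loop=False, spend_predictable=False, teams=6),
]
for w in examples:
    print(f"{w.name}: {billing_posture(w)}")
```

Real workloads rarely split this cleanly, so the output is best used as a first-pass label to review with each team, not as a final allocation.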
08 — The Playbook: What a budget owner should do now.
The immediate action is a calendar item: if any part of your stack relies on Fable 5, decide before July 7 whether you enable usage credits or fall back to Opus 4.8 (which is also where blocked Fable 5 requests auto-reroute). Enterprises on standard seats are already metered, so the decision there is about limits, not access. Either way, turn on the spend-control and analytics tooling both vendors now ship — you cannot govern a variable cost you cannot see per team.
Looking forward, expect the meter to keep spreading, not retreating. Anthropic has now tested metered programmatic usage twice in a month; Google already debits a compute budget behind its subscriptions; and Microsoft has begun pitching lower-cost in-house models explicitly to reduce dependence on both labs, with GitHub Copilot auto-routing coding tasks to the most cost-appropriate model. The defensible posture is not loyalty to a vendor's pricing promise — OpenAI's included stance is a strategy, not a covenant — but architecture that can move: a documented fallback model, task-based routing, and a budget you re-check monthly. For the mechanics of routing and committing spend, our breakdown of token-based vs outcome-based agent pricing pairs with a token-budget calculator built for exactly this decision.
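A documented fallback, task-based routing, and a budget check fit in one small routine. The sketch below is a hedged illustration, not a vendor API: `Budget`, `estimate_cost`, and `route` are hypothetical names, the "primary" rates reuse Fable 5's list price, and the "fallback" rates are placeholders borrowed from the cheapest row of the rate card.

```python
from dataclasses import dataclass
from typing import Optional

# USD per million tokens as (input, output). Placeholder rates for illustration.
MODELS = {"primary": (10.00, 50.00), "fallback": (1.00, 6.00)}


@dataclass
class Budget:
    monthly_cap_usd: float
    spent_usd: float = 0.0

    def can_afford(self, cost: float) -> bool:
        return self.spent_usd + cost <= self.monthly_cap_usd


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimated USD cost of one request at per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def route(budget: Budget, input_tokens: int, output_tokens: int,
          hard_task: bool) -> Optional[str]:
    """Pick a model within budget; return None when the hard cap is hit."""
    # Easy turns go straight to the cheap model; hard ones try primary first.
    candidates = ["primary", "fallback"] if hard_task else ["fallback"]
    for name in candidates:
        cost = estimate_cost(input_tokens, output_tokens, *MODELS[name])
        if budget.can_afford(cost):
            budget.spent_usd += cost
            return name
    return None  # caller should queue, alert, or refuse; never overspend silently
```

With a $1.20 cap, a hard task of 50,000 input and 10,000 output tokens costs an estimated $1.00 on the primary model, the next identical task falls back to the cheaper model at about $0.11, and the third returns `None`. Estimates made before a call will drift from billed usage, so reconcile against the vendor's usage reports during the monthly review.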
This is also where an outside pass earns its keep. When we scope a client's AI budget, we classify each workload into the archetypes above, instrument the meter, and set spend controls before anything goes to production, the same way we scope agentic AI budgets in our AI transformation engagements. The goal is boring on purpose: no surprise invoices, and a billing shape you chose deliberately rather than inherited from whichever vendor changed its policy last.
09 — Conclusion: Match the meter to the workload.
The vendor you pick matters less than the billing shape you can plan around.
One week produced two opposite bets: Anthropic put its best coding model behind a meter after July 7, and OpenAI held Codex inside the subscription with credits kept optional. Both bets are commercial, not moral — and both land into an efficiency era where enterprises are actively capping AI spend for the first time.
The practical response is not to crown a winner. It is to classify your workloads by how predictable and how machine-driven their token spend is, then match each to a billing posture — subscription for steady human-paced use, metered-committed with spend controls for spiky agent loops. Get that mapping right and the vendor's next policy change becomes a routing decision, not a fire drill.
The signal underneath is the durable one: as long as the largest variable cost in an AI product is tokens, billing shape is a first-class architectural decision, not a procurement afterthought. The teams that treat it that way — a documented fallback, task-based routing, a monthly budget review — are the ones that will keep shipping when the next meter turns on.