Choosing a Claude model in July 2026 is no longer a single-axis decision. Anthropic ships three frontier tiers at once — Sonnet 5, Opus 4.8, and Fable 5 — and Sonnet 5’s headline $2/$10 introductory price makes it look like the automatic default. It often is. But the honest answer is that the cheapest per-token model is not always the cheapest per-task model, and the gap turns on one lever most teams never touch on purpose: effort.

Sonnet 5 launched on June 30, 2026 as, in Anthropic’s framing, the most agentic Sonnet model yet. Its per-token price is genuinely low — identical to Sonnet 4.6, and roughly half of Opus 4.8. Yet Anthropic’s own tokenizer produces about 30% more tokens for the same text, and independent benchmarking lab Artificial Analysis measured Sonnet 5 at a higher cost per task than Opus 4.8 on their standard-pricing index. Both things are true at once, and the reconciliation is the whole point of this guide.

Below we lay out the three-tier lineup, recompute the pricing math from Anthropic’s published rates, and turn it into two operational tools: an effort-level breakeven matrix and a three-model decision matrix you can act on. Our earlier Sonnet 5 launch coverage framed it as near-Opus quality at roughly half the price; this piece adds the fuller picture — the discount narrows, and can invert, as effort climbs.

Key takeaways

01
The intro price is real, but Anthropic calls it cost-neutral. Sonnet 5's $2/$10 introductory rate runs through August 31, 2026, then reverts to $3/$15. Anthropic itself describes the intro pricing as roughly cost-neutral versus Sonnet 4.6: it offsets the tokenizer's extra tokens rather than being a straight price cut.
02
The cost inversion only bites at high and max effort. At low and medium effort, Sonnet 5 genuinely saves money against Opus 4.8. Push it to xhigh or max and it can cost more per task for comparable quality; Artificial Analysis measured about 15% more at blended effort on standard pricing. Do not read this as Sonnet 5 being pricier everywhere.
03
Effort level, not model choice, is the bigger cost lever. Artificial Analysis found max effort uses roughly six times the conversational turns of low effort on GDPval-AA. That swing is larger than the Sonnet-to-Opus price gap. Set the effort dial deliberately before you argue about which model to route to.
04
Opus 4.8 owns correctness-critical, max-effort work. Opus 4.8 leads coding benchmarks (SWE-bench Pro 69.2% vs Sonnet 5's 63.2%) and, at high and max effort, comes in cheaper per task in practice. Its Dynamic Workflows fan out hundreds of parallel subagents for hundred-thousand-line refactors.
05
Fable 5 earns its 2x premium only on long-horizon work. At $10/$50, Fable 5 costs double Opus 4.8. Anthropic says its lead grows with task length and complexity, so reserve it for multi-day planning and reasoning where the horizon itself is the differentiator, not routine agentic coding throughput.

01 · The Lineup: Three models, three price tiers.

For the first time, Anthropic’s frontier is a menu, not a single flagship. Sonnet 5 is the balanced default, Opus 4.8 is the correctness workhorse, and Fable 5 is the general-access, safety-classified version of the Mythos-tier capability. They arrived within five weeks of each other, and their prices span a 5x range from the Sonnet 5 intro rate to Fable 5.

Opus 4.8 shipped on May 28, 2026 with pricing unchanged from Opus 4.7 and a new capability Anthropic calls Dynamic Workflows — Claude plans a large task, fans out potentially hundreds of parallel subagents, and verifies their output against a test suite. Fable 5, from its split-tier launch on June 9, 2026, is the publicly available half of that release.

Balanced default

Claude Sonnet 5

$2/$10 intro · $3/$15 standard

The most agentic Sonnet yet, per Anthropic — plans, uses browsers and terminals, runs autonomously. Same per-token price as Sonnet 4.6. Introductory rate runs through Aug 31, 2026, then standard pricing applies.

Launched Jun 30, 2026

Correctness tier

Claude Opus 4.8

$5/$25 · Fast Mode $10/$50 (~2.5x speed)

Anthropic's correctness workhorse, pricing unchanged from Opus 4.7. Dynamic Workflows fan out hundreds of parallel subagents and verify against a test suite — built for hundred-thousand-line refactors, not routine single-agent tasks.

Launched May 28, 2026

Frontier tier

Claude Fable 5

$10/$50 · double Opus 4.8

The general-access, safety-classified version of Anthropic's Mythos-tier capability. Anthropic says the longer and more complex the task, the larger its lead over the other models — so its edge is long-horizon, not throughput.

Launched Jun 9, 2026

Anthropic's own positioning

Anthropic frames Sonnet 5 as closing the gap rather than matching the top: “Sonnet 5 narrows the gap: its performance is close to that of Opus 4.8, but at lower prices.” That is the marketing claim to test — close, at lower per-token prices. Whether it is cheaper per finished task depends on the two variables the rest of this guide unpacks: token expansion and effort.

02 · Pricing: The pricing table, with the math recomputed.

Start with the published per-million-token rates and work a single, transparent task through each model. The illustrative task below is one million input tokens plus two hundred thousand output tokens, a single pass with no caching, priced at each model’s list rate. The task-cost and ratio columns are our arithmetic, not a vendor figure — recompute them yourself from the row rates.

Per-million-token input and output prices for Claude Sonnet 5, Opus 4.8, and Fable 5, with an illustrative task cost (1,000,000 input tokens plus 200,000 output tokens, single pass, no caching) and that cost as a ratio of Opus 4.8’s $10.00. Prices from Anthropic’s pricing page, retrieved July 1, 2026; task-cost and ratio columns are Digital Applied arithmetic.
Model	Input $/M	Output $/M	Illustrative task*	vs Opus 4.8
Introductory window — Sonnet 5 only, through Aug 31, 2026
Sonnet 5 (intro)	$2	$10	$4.00	40%
Standard list pricing — from Sep 1, 2026
Sonnet 5 (standard)	$3	$15	$6.00	60%
Opus 4.8	$5	$25	$10.00	baseline
Fable 5	$10	$50	$20.00	200%

* Illustrative task = 1M input + 200K output tokens, single pass, no caching. Note the assumption baked in: equal token counts across models. That is exactly where the comparison gets interesting, because real Sonnet 5 workloads do not emit equal token counts.
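The task-cost and ratio columns can be recomputed with a few lines of Python. This is a minimal sketch: the dictionary keys are illustrative labels rather than official API model identifiers, and the rates are simply the list prices quoted in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens


# List rates as quoted in the table; keys are illustrative labels, not API model IDs.
RATES = {
    "sonnet-5-intro": Rate(2.0, 10.0),
    "sonnet-5-standard": Rate(3.0, 15.0),
    "opus-4.8": Rate(5.0, 25.0),
    "fable-5": Rate(10.0, 50.0),
}


def task_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Single-pass cost in USD with no caching or batch discount."""
    total = input_tokens * rate.input_per_m + output_tokens * rate.output_per_m
    return total / 1_000_000


# The illustrative task: 1M input tokens plus 200K output tokens.
baseline = task_cost(RATES["opus-4.8"], 1_000_000, 200_000)
for name, rate in RATES.items():
    cost = task_cost(rate, 1_000_000, 200_000)
    print(f"{name:18s} ${cost:6.2f}  {cost / baseline:.0%} of Opus 4.8")
```

Swapping in your own input and output token counts is the quickest way to see how far your workload sits from the equal-token assumption.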

On raw per-token price, Sonnet 5 is unambiguously cheaper: $6.00 at standard pricing is 60% of Opus 4.8’s $10.00 for the same tokens, and $4.00 during the intro window is just 40%. But Anthropic’s updated tokenizer produces about 30% more tokens for the same text than Sonnet 4.6’s did — a change that also applies to Opus 4.7 and up, Fable 5, and the Mythos models. More tokens per unit of work quietly erodes the per-token advantage.

Read the fine print

Anthropic states the introductory pricing is calibrated so that migrating from Sonnet 4.6 is roughly cost-neutral despite the extra tokens. In other words, the $2/$10 headline is not a straight discount; it is compensating for the tokenizer’s token expansion. Secondary coverage puts that expansion at 1.0 to 1.35x depending on content type. Anthropic’s headline ~30% sits near the top of that range, so treat the range as the mechanism behind the single figure, not a contradiction.
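To see how the expansion interacts with the two Sonnet 5 price points, the sketch below scales price by token count for the same text and compares the result with Sonnet 4.6's $3/$15. It assumes the expansion factor applies equally to input and output tokens and ignores any change in how much the model writes; both are simplifications for illustration, and the function name is our own.

```python
def effective_cost_ratio(new_rate: float, old_rate: float, token_expansion: float) -> float:
    """Cost of the same text on the new model relative to the old one.

    new_rate and old_rate are USD per million tokens; token_expansion is
    the factor by which the new tokenizer inflates the token count.
    """
    return (new_rate * token_expansion) / old_rate


# Input rates are enough here: $2 -> $3 and $10 -> $15 are the same 2:3 proportion.
for expansion in (1.0, 1.30, 1.35):
    intro = effective_cost_ratio(2.0, 3.0, expansion)
    standard = effective_cost_ratio(3.0, 3.0, expansion)
    print(
        f"expansion {expansion:.2f}x -> intro {intro:.2f}x, "
        f"standard {standard:.2f}x of Sonnet 4.6 cost"
    )
```

Under these assumptions the intro rate absorbs the expansion, while the standard rate passes it through in full once the window closes.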

03 · Cost Inversion: Where the cheaper model becomes the pricier one.

Artificial Analysis, an independent benchmarking lab, ran the three models through its Intelligence Index at standard (non-promotional) pricing and measured cost per task rather than cost per token. On that basis, Sonnet 5 came in at $2.29 per task versus roughly $1.99 for Opus 4.8 — about 15% higher, despite Sonnet 5’s lower sticker price. The driver was token usage: per the lab, Sonnet 5 used roughly 40% more output tokens per Index task than Sonnet 4.6, and on GDPval-AA it took about three times as many conversational turns. These are Artificial Analysis’s own measurements on their methodology, not a universal cost of running Sonnet 5.

Cost per task · Artificial Analysis Intelligence Index

Source: Artificial Analysis Intelligence Index, June 30, 2026 · standard (post-intro) pricing

Sonnet 4.6 (prior-generation Sonnet, reference point): $1.15
Opus 4.8 (correctness tier, blended-effort task): $1.99
Sonnet 5 (lower per-token price, more tokens per task): $2.29

The crucial nuance — and the reason this is not a simple gotcha — is that the $2.29 figure is a blended-effort average at standard pricing. It does not mean Sonnet 5 is always dearer. During the intro window the per-token gap is wider, and at low and medium effort the token expansion is small enough that Sonnet 5 stays genuinely cheaper. The inversion is an effort-dependent phenomenon, not a fixed property of the model.
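The inversion can be framed as a breakeven question: how many times more tokens can Sonnet 5 spend on a task before its lower rate stops paying off? The sketch below computes that multiplier from the list rates, then backs out the multiplier implied by the $2.29 and $1.99 figures. The implied number assumes both models had the same input/output mix on the Index tasks, which the lab does not state, so read it as a rough indication only.

```python
def breakeven_token_multiplier(cheap_rate: float, premium_rate: float) -> float:
    """How many times more tokens the cheaper model can use before costs match."""
    return premium_rate / cheap_rate


def implied_token_multiplier(
    cheap_task_cost: float,
    premium_task_cost: float,
    cheap_rate: float,
    premium_rate: float,
) -> float:
    """Token multiplier implied by two per-task costs, assuming the same input/output mix."""
    return (cheap_task_cost / premium_task_cost) * (premium_rate / cheap_rate)


# Input rates suffice: output rates sit in the same 3:5 and 2:5 proportions.
standard = breakeven_token_multiplier(3.0, 5.0)  # Sonnet 5 standard vs Opus 4.8
intro = breakeven_token_multiplier(2.0, 5.0)     # Sonnet 5 intro vs Opus 4.8
implied = implied_token_multiplier(2.29, 1.99, 3.0, 5.0)

print(f"breakeven at standard pricing: {standard:.2f}x tokens")
print(f"breakeven at intro pricing:    {intro:.2f}x tokens")
print(f"implied by $2.29 vs $1.99:     {implied:.2f}x tokens")
```

On this simplified model, Sonnet 5 stays cheaper at standard pricing while it uses fewer than about 1.67 times the tokens Opus 4.8 would, and fewer than 2.5 times during the intro window.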

Here is the part almost no launch coverage operationalizes: the same lab found max effort uses roughly six times the conversational turns of low effort on GDPval-AA. That six-fold swing dwarfs the difference in per-token price between Sonnet 5 and Opus 4.8, which is about 1.67x at standard rates and 2.5x during the intro window. The practical read is that effort level is the primary cost dial and model choice is the secondary one. Most teams have it backwards, agonizing over which model to route to while leaving effort at a reflexive default.

"With Claude Sonnet 5, agents stay on plan, follow our conventions, and ship clean multi-step changes, all at an efficient cost."
Sualeh Asif, Co-founder, Cursor

04 · Benchmarks: Capability gap vs price gap, side by side.

The capability ordering is clear and consistent: on the headline agentic-coding benchmark, SWE-bench Pro, Opus 4.8 leads Sonnet 5, which in turn clears Sonnet 4.6 by a comfortable margin. Fable 5 sits highest, though its figure is Anthropic-reported rather than independently audited. The question is whether that capability gap is worth the price gap for your workload — and on many tasks it is not.

SWE-bench Pro · agentic coding accuracy

Source: Anthropic disclosures via VentureBeat & MarkTechPost; Fable 5 figure vendor-stated

Sonnet 4.6 (prior generation, agentic coding): 58.1%
Sonnet 5 (new default, +5.1 pts over 4.6): 63.2%
Opus 4.8 (correctness tier, +6.0 pts over Sonnet 5): 69.2%
Fable 5 (Anthropic-reported, vendor-stated): 80.3%

Widen the lens beyond a single benchmark and the Sonnet-5-to-Opus-4.8 gap narrows to near-noise on several axes. On agentic knowledge work, they are effectively tied.

Terminal-Bench 2.1 · Opus 4.8 leads narrowly

Opus 4.8 82.7% vs Sonnet 5 80.4%, and both leap over Sonnet 4.6's 67.0%. Sonnet 5's jump over its predecessor is a bigger story than its gap to Opus.

GDPval-AA v2 · Sonnet 5 edges Opus

Sonnet 5 1,618 vs Opus 4.8 1,615 on the v2 scale, a statistical tie for agentic knowledge work. Do not compare these to older GDPval-AA numbers from earlier launches.

Humanity's Last Exam · Near-tie, with tools

Opus 4.8 57.9% vs Sonnet 5 57.4% with tools, a gap small enough to be noise. On raw knowledge under tool use, the two are effectively level.

The takeaway from the benchmark spread is not that one model wins. It is that Opus 4.8’s advantage is real but concentrated — largest on the hardest coding, negligible on knowledge work where Sonnet 5 sometimes edges ahead. That shape is exactly what makes a per-workload routing strategy pay off rather than a single default.

05 · Effort Matrix: The effort-level breakeven matrix.

Sonnet 5 and Opus 4.8 share the same five effort levels — low, medium, high, xhigh, and max — replacing the older manual budget_tokens parameter, which now returns a 400 error on both. That shared dial is what makes a clean crossover analysis possible. The matrix below reads left to right: as effort climbs, the cheaper-in-practice model flips from Sonnet 5 to Opus 4.8. The turn-count column is anchored to Artificial Analysis’s GDPval-AA data; the intermediate cells are qualitative because no source publishes a per-tier dollar figure.

For each Sonnet 5 / Opus 4.8 effort level, the model that is cheaper in practice, the relative conversational-turn cost versus low effort, the best-fit work, and the rationale. Turn multipliers (low baseline to about six times at max) are Artificial Analysis GDPval-AA measurements; model recommendations synthesise Artificial Analysis cost data and MarkTechPost effort-band analysis. Digital Applied framework, July 1, 2026.
Effort	Cheaper in practice	Turns vs low	Best-fit work	Why
Low	Sonnet 5	baseline	Triage, classification, routing, simple extraction	Per-token discount dominates; adaptive thinking stays light.
Medium	Sonnet 5	modestly higher	Day-to-day coding, multi-step automation, structured retrieval	Independent analysis puts low and medium effort in Sonnet 5’s favor over Opus 4.8.
High	Benchmark it	materially higher	Harder multi-file changes, thorough review	The crossover zone — token growth starts eating the per-token gap. Measure on your own prompts.
xHigh	Opus 4.8	high	Correctness-critical coding	Trade-press analysis reports that at xhigh, Sonnet 5’s cost can exceed Opus 4.8’s for comparable quality.
Max	Opus 4.8	up to ~6×	Hardest refactors, Dynamic Workflows	Effort, not model, is the dominant cost lever here; Opus 4.8’s max-effort quality edge justifies it.

How to use this matrix

The citable version of the whole argument comes from independent trade-press analysis of best-value zones: low- and medium-effort tasks favor Sonnet 5 over Opus 4.8, while at xhigh effort Sonnet 5’s cost can exceed Opus 4.8’s for comparable quality. Set your effort level first, then read the row. If you are pinning everything to a single model and a single effort default, you are almost certainly overpaying on one end of the workload spectrum.
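For teams that want the matrix in code rather than in a slide, a plain lookup is enough. The sketch below encodes the "cheaper in practice" column; the model labels and the "benchmark" sentinel are illustrative choices of ours, not API identifiers.

```python
# Effort level -> model the matrix marks as cheaper in practice.
# None marks the crossover zone, where the matrix says to measure on your own prompts.
EFFORT_MATRIX = {
    "low": "sonnet-5",
    "medium": "sonnet-5",
    "high": None,
    "xhigh": "opus-4.8",
    "max": "opus-4.8",
}


def cheaper_in_practice(effort: str) -> str:
    """Return the matrix recommendation, or 'benchmark' for the crossover zone."""
    key = effort.lower()
    if key not in EFFORT_MATRIX:
        raise ValueError(f"unknown effort level: {effort!r}")
    choice = EFFORT_MATRIX[key]
    return choice if choice is not None else "benchmark"


for level in EFFORT_MATRIX:
    print(f"{level:6s} -> {cheaper_in_practice(level)}")
```

Keeping the crossover row as an explicit "benchmark" result, rather than silently picking a model, forces a measured decision at high effort.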

06 · Decision Matrix: The three-model decision matrix.

Fold price, benchmarks, and effort together and the guidance collapses into four archetypes. Each maps a workload shape to a model and an effort band. Note that we deliberately leave Fable 5’s effort configuration unspecified — Anthropic has not confirmed it shares the exact five-level dial, so we position Fable 5 by task type rather than by effort tier.

High-volume agents

Triage, classification, routing

Support-ticket triage, lead scoring, document classification, simple extraction at scale. Effort stays low, tokens stay lean, and Sonnet 5's per-token discount compounds across millions of calls.

Sonnet 5 · low effort

Day-to-day coding

Everyday builds at low/medium effort

Feature work, refactors of moderate scope, multi-step automations. This is the sweet spot where Sonnet 5 is both capable enough and genuinely cheaper than Opus 4.8 — the majority of production agent work lives here.

Sonnet 5 · medium effort

Correctness-critical

Max-effort production refactors

Hundred-thousand-line refactors, migrations that must not regress, anything where a wrong answer is expensive. Opus 4.8 leads the coding benchmarks and, at high/max effort, comes in cheaper per task — pair it with Dynamic Workflows.

Opus 4.8 · high/max

Long-horizon planning

Multi-day reasoning & strategy

Anthropic says Fable 5's lead over its other models grows with task length and complexity. Reserve the 2x price for genuinely long-horizon planning where the horizon itself is the differentiator — not routine throughput.

Fable 5
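The four archetypes can be sketched as a small routing function. The trait flags, their priority order, and the model labels are assumptions made for illustration; the effort for Fable 5 is left as None because its effort configuration is unspecified here, and "max" stands in for the high/max band on the correctness card.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    model: str             # illustrative label, not an API model ID
    effort: Optional[str]  # None where no effort tier is specified


def route(long_horizon: bool, correctness_critical: bool, high_volume: bool) -> Route:
    """Map workload traits to one of the four archetypes.

    The priority order (horizon, then correctness, then volume) is an assumption.
    """
    if long_horizon:
        return Route("fable-5", None)
    if correctness_critical:
        return Route("opus-4.8", "max")
    if high_volume:
        return Route("sonnet-5", "low")
    return Route("sonnet-5", "medium")  # day-to-day coding default


print(route(long_horizon=False, correctness_critical=False, high_volume=True))
print(route(long_horizon=False, correctness_critical=True, high_volume=False))
```

A real router would likely take richer signals than three booleans, but starting from explicit archetypes keeps the routing policy reviewable.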

If your work sits at the boundary between coding and pure frontier reasoning, it is worth reading how the top tier compares across vendors — our breakdown of how Fable 5 stacks up against GPT-5.5 covers the cost trade at the frontier, where the per-task math looks different again.

07 · Watch-Outs: Comparison landmines to avoid.

A few traps recur in the coverage of these launches. Two are worth surfacing because they can quietly invalidate a comparison you build in good faith — one about benchmark scales, one about safety behavior that changes what a model will even answer.

Landmine 1 · GDPval-AA scale confusion

Opus 4.8’s GDPval-AA score appears as roughly 1,890 in its May 28 launch coverage and as 1,615 in the June 30 Sonnet 5 comparison. Those are different scales — the later figure is labelled GDPval-AA v2, an apparent rescaling. The exact rescaling methodology has not been published, so never plot the v2 numbers (Sonnet 5 1,618, Opus 4.8 1,615) on the same axis as the earlier GDPval-AA figures from the Opus 4.8 or Fable 5 launches. When a score jumps or drops across launch dates, check whether the benchmark itself was revised before you draw a conclusion.

Landmine 2 · safety classifiers change the answer set

Sonnet 5 is the first Sonnet-tier model with real-time cybersecurity safeguards: requests touching prohibited or high-risk cyber topics may be refused, returned as an HTTP 200 with stop_reason: "refusal" rather than an error. Fable 5 goes further, shipping classifiers for cybersecurity, biology and chemistry, and distillation — when a request trips one, it is automatically answered by Opus 4.8 instead and the user is told. Anthropic reports this fallback happens in fewer than 5% of sessions, but if your evaluation set brushes those topics, you may be silently benchmarking a different model than you think.
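One way to keep an evaluation honest is to audit each run for refusals and for answers served by a different model. The sketch below works on already-parsed response bodies. The stop_reason check follows the behavior described above; treating the response's model field as the signal for a fallback is an assumption of ours, and the model strings are placeholder labels.

```python
from collections import Counter


def audit_eval_run(responses, requested_model):
    """Count refusals and responses served by a model other than the one requested.

    Each item is a parsed JSON response body (a dict). Comparing the "model"
    field by exact equality is an assumption; adjust it if the API returns
    dated or aliased model identifiers.
    """
    counts = Counter()
    for body in responses:
        if body.get("stop_reason") == "refusal":
            counts["refusal"] += 1
        elif body.get("model") != requested_model:
            counts["other_model"] += 1
        else:
            counts["clean"] += 1
    return counts


# Hypothetical response bodies for illustration only.
sample = [
    {"model": "fable-5", "stop_reason": "end_turn"},
    {"model": "fable-5", "stop_reason": "refusal"},
    {"model": "opus-4.8", "stop_reason": "end_turn"},
]
print(audit_eval_run(sample, "fable-5"))
```

Scoring only the clean bucket, and reporting the other two separately, avoids averaging refusals or fallback answers into a model's benchmark number.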

"A model that knows when to say no is just as important as one that knows how to build."
Fabian Hedin, Lovable

08 · In Production: Routing all three in production.

The operational conclusion is not to standardize on one model. It is to route by workload and to treat effort as a first-class budget control. A pragmatic default stack looks like this — Sonnet 5 carries the volume, Opus 4.8 handles the hard correctness work, and Fable 5 is held in reserve for genuinely long-horizon tasks.

Default agent

Sonnet 5 · low/medium

High-volume triage, classification, day-to-day automation and coding. The per-token discount dominates while effort stays low — this is where most calls should land.

cheapest at scale

Correctness

Opus 4.8 · high/max

Correctness-critical refactors and Dynamic Workflows. At high and max effort it can come in cheaper per task than Sonnet 5 for comparable quality; Artificial Analysis's blended-effort measurement already puts it at $1.99 versus $2.29.

~13% less at blended effort

Long-horizon

Fable 5 · reserve

Multi-day planning where task length itself is the differentiator. Anthropic says its lead grows with complexity — worth double the price only when the horizon justifies it.

2× Opus price

Two guardrails make this stack hold up. First, batch anything latency-tolerant: the Batch API halves the rate, so Sonnet 5 intro batch runs at $1/$5 per million and Opus 4.8 batch at $2.50/$12.50 — a straight 50% cut on the numbers above. Second, lean on prompt caching for repeated context; a cache read is billed at one-tenth of base input across all three models, which can dominate the economics of any agent that re-reads a large system prompt or codebase.
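Both guardrails are easy to fold into a per-call cost estimate. The sketch below applies the 50% batch rate and the one-tenth cache-read rate from the paragraph above. It ignores any cache-write charge and assumes the two discounts stack, neither of which is covered here, so verify both against the current pricing page before budgeting on it.

```python
BATCH_DISCOUNT = 0.5     # Batch API halves the rate
CACHE_READ_FACTOR = 0.1  # cache reads billed at one-tenth of base input


def call_cost(
    input_tokens: int,
    output_tokens: int,
    input_per_m: float,
    output_per_m: float,
    cached_input_tokens: int = 0,
    batch: bool = False,
) -> float:
    """Estimated USD cost of one call; cache-write charges are not modeled."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh = input_tokens - cached_input_tokens
    cost = (
        fresh * input_per_m
        + cached_input_tokens * input_per_m * CACHE_READ_FACTOR
        + output_tokens * output_per_m
    ) / 1_000_000
    # Assumption: the batch discount applies on top of cache pricing.
    return cost * BATCH_DISCOUNT if batch else cost


# Sonnet 5 intro rates, 1M input (900K of it served from cache), 200K output.
print(f"${call_cost(1_000_000, 200_000, 2.0, 10.0, cached_input_tokens=900_000):.2f}")
# Same call with no caching, sent through batch.
print(f"${call_cost(1_000_000, 200_000, 2.0, 10.0, batch=True):.2f}")
```

Running the same estimate across your real traffic mix shows quickly whether caching or batching moves the bill more for a given agent.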

The larger pattern to watch through the back half of 2026: Anthropic has stopped shipping a single flagship and started shipping a price-segmented ladder, with the tokenizer and effort dial doing more to determine real cost than the sticker price. Teams that build routing and effort control into their agent infrastructure now will adapt to the next tier reshuffle without re-plumbing; teams that hardcode a single model and effort default will keep overpaying every time the ladder shifts. If you want a second pair of hands on that routing logic, our agentic AI transformation engagements start with exactly this kind of per-workload model and effort eval.

09 · Conclusion: Pick per workload, not per headline.

Conclusion · Picking a Claude tier

The cheap model is only cheap until you turn up the effort.

Sonnet 5 is the right default for most agent work, and at low and medium effort it genuinely saves money against Opus 4.8. That is the honest headline, and it should not get lost in the more surprising finding that at high and max effort the per-task cost can invert. Both are true because effort, not model choice, is the dominant cost lever — a six-fold swing in conversational turns from low to max dwarfs the two-fold gap in per-token price.

So the decision is not “which model is best.” It is a two-step routine: set the effort level the task actually needs, then pick the model that is cheaper in practice at that level. High-volume, low-effort work goes to Sonnet 5; correctness-critical, max-effort work goes to Opus 4.8; genuinely long-horizon planning is where Fable 5’s premium pays for itself. Treat the $2.29-versus-$1.99 comparison as one lab’s blended-effort snapshot, not a verdict — your own workload mix is the only benchmark that settles it.

The broader signal is that Anthropic’s frontier is now a price-segmented ladder rather than a single flagship, and the levers that move real cost — token expansion and effort — sit below the sticker price where most buyers never look. The teams that win the next year of agent economics will be the ones who measure cost per finished task on their own prompts and build effort control into the stack, not the ones who chase the lowest number on the pricing page.
