AI Development · Decision Matrix · 11 min read · Published July 1, 2026

Three models · three price tiers · effort, not model, is the hidden cost lever

Sonnet 5 vs Opus 4.8 vs Fable 5: when to use which

Anthropic now sells three frontier tiers at once — Sonnet 5 at a $2/$10 introductory price, Opus 4.8 at $5/$25, and Fable 5 at $10/$50. Sonnet 5 looks like the obvious default until you account for the tokenizer and the effort dial. This is the decision matrix for picking one per workload.

Digital Applied Team
Senior strategists · Published Jul 1, 2026
Sources: 10 cited
  • Sonnet 5 intro price: $2/$10 per M in/out through Aug 31, then $3/$15
  • Tokenizer tax: ~30% more tokens for the same text (Anthropic-stated)
  • Sonnet 5 $/task: $2.29 vs Opus 4.8 ~$1.99 on the AA index (~15% higher)
  • Effort span: 6× max vs low turns on GDPval-AA

Choosing a Claude model in July 2026 is no longer a single-axis decision. Anthropic ships three frontier tiers at once — Sonnet 5, Opus 4.8, and Fable 5 — and Sonnet 5’s headline $2/$10 introductory price makes it look like the automatic default. It often is. But the honest answer is that the cheapest per-token model is not always the cheapest per-task model, and the gap turns on one lever most teams never touch on purpose: effort.

Sonnet 5 launched on June 30, 2026 as, in Anthropic’s framing, the most agentic Sonnet model yet. Its per-token price is genuinely low: identical to Sonnet 4.6, and roughly half of Opus 4.8. Yet Anthropic’s updated tokenizer produces about 30% more tokens for the same text than Sonnet 4.6’s did, and independent benchmarking lab Artificial Analysis measured Sonnet 5 at a higher cost per task than Opus 4.8 on its standard-pricing index. Both things are true at once, and reconciling them is the whole point of this guide.

Below we lay out the three-tier lineup, recompute the pricing math from Anthropic’s published rates, and turn it into two operational tools: an effort-level breakeven matrix and a three-model decision matrix you can act on. Our earlier Sonnet 5 launch coverage framed it as near-Opus quality at roughly half the price; this piece adds the fuller picture — the discount narrows, and can invert, as effort climbs.

Key takeaways
  1. The intro price is real, but Anthropic calls it cost-neutral.
     Sonnet 5's $2/$10 introductory rate runs through August 31, 2026, then reverts to $3/$15. Anthropic itself describes the intro pricing as roughly cost-neutral versus Sonnet 4.6: it offsets the tokenizer's extra tokens rather than being a straight price cut.
  2. The cost inversion only bites at high and max effort.
     At low and medium effort, Sonnet 5 genuinely saves money against Opus 4.8. Push it to xhigh or max and it can cost more per task for comparable quality; Artificial Analysis measured about 15% more at blended effort on standard pricing. Do not read this as Sonnet 5 being pricier everywhere.
  3. Effort level, not model choice, is the bigger cost lever.
     Artificial Analysis found max effort uses roughly six times the conversational turns of low effort on GDPval-AA. That swing is larger than the Sonnet-to-Opus price gap. Set the effort dial deliberately before you argue about which model to route to.
  4. Opus 4.8 owns correctness-critical, max-effort work.
     Opus 4.8 leads coding benchmarks (SWE-bench Pro 69.2% vs Sonnet 5's 63.2%) and, at high and max effort, comes in cheaper per task in practice. Its Dynamic Workflows fan out hundreds of parallel subagents for hundred-thousand-line refactors.
  5. Fable 5 earns its 2x premium only on long-horizon work.
     At $10/$50, Fable 5 costs double Opus 4.8. Anthropic says its lead grows with task length and complexity, so reserve it for multi-day planning and reasoning where the horizon itself is the differentiator, not routine agentic coding throughput.

01 · The Lineup · Three models, three price tiers.

For the first time, Anthropic’s frontier is a menu, not a single flagship. Sonnet 5 is the balanced default, Opus 4.8 is the correctness workhorse, and Fable 5 is the general-access, safety-classified version of the Mythos-tier capability. They arrived within five weeks of each other, and their prices span a 5x range from the Sonnet 5 intro rate to Fable 5.

Opus 4.8 shipped on May 28, 2026 with pricing unchanged from Opus 4.7 and a new capability Anthropic calls Dynamic Workflows: Claude plans a large task, fans out potentially hundreds of parallel subagents, and verifies their output against a test suite. Fable 5 followed on June 9, 2026 in a split-tier launch, of which it is the publicly available half.

Balanced default · Claude Sonnet 5
$2/$10 intro · $3/$15 standard · Launched Jun 30, 2026

The most agentic Sonnet yet, per Anthropic: it plans, uses browsers and terminals, and runs autonomously. Same per-token price as Sonnet 4.6. Introductory rate runs through Aug 31, 2026, then standard pricing applies.

Correctness tier · Claude Opus 4.8
$5/$25 · Fast Mode $10/$50 (~2.5x) · Launched May 28, 2026

Anthropic's correctness workhorse, pricing unchanged from Opus 4.7. Dynamic Workflows fan out hundreds of parallel subagents and verify against a test suite. It is built for hundred-thousand-line refactors, not routine single-agent tasks.

Frontier tier · Claude Fable 5
$10/$50 · double Opus 4.8 · Launched Jun 9, 2026

The general-access, safety-classified version of Anthropic's Mythos-tier capability. Anthropic says the longer and more complex the task, the larger its lead over the other models, so its edge is long-horizon, not throughput.

Anthropic's own positioning
Anthropic frames Sonnet 5 as closing the gap rather than matching the top: “Sonnet 5 narrows the gap: its performance is close to that of Opus 4.8, but at lower prices.” That is the marketing claim to test — close, at lower per-token prices. Whether it is cheaper per finished task depends on the two variables the rest of this guide unpacks: token expansion and effort.

02 · Pricing · The pricing table, with the math recomputed.

Start with the published per-million-token rates and work a single, transparent task through each model. The illustrative task below is one million input tokens plus two hundred thousand output tokens, a single pass with no caching, priced at each model’s list rate. The task-cost and ratio columns are our arithmetic, not a vendor figure — recompute them yourself from the row rates.

Per-million-token input and output prices for Claude Sonnet 5, Opus 4.8, and Fable 5, with an illustrative task cost (1,000,000 input tokens plus 200,000 output tokens, single pass, no caching) and that cost as a ratio of Opus 4.8’s $10.00. Prices from Anthropic’s pricing page, retrieved July 1, 2026; task-cost and ratio columns are Digital Applied arithmetic.

| Model | Input $/M | Output $/M | Illustrative task* | vs Opus 4.8 |
|---|---|---|---|---|
| Introductory window (Sonnet 5 only, through Aug 31, 2026) | | | | |
| Sonnet 5 (intro) | $2 | $10 | $4.00 | 40% |
| Standard list pricing (from Sep 1, 2026) | | | | |
| Sonnet 5 (standard) | $3 | $15 | $6.00 | 60% |
| Opus 4.8 | $5 | $25 | $10.00 | baseline |
| Fable 5 | $10 | $50 | $20.00 | 200% |

* Illustrative task = 1M input + 200K output tokens, single pass, no caching. Note the assumption baked in: equal token counts across models. That is exactly where the comparison gets interesting, because real Sonnet 5 workloads do not emit equal token counts.
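
The task-cost and ratio columns can be recomputed in a few lines. This is a minimal sketch using the per-million rates from the table; the names `RATES` and `task_cost` are our own illustration, not part of any Anthropic SDK.

```python
# Per-million-token list rates (input, output) in USD, copied from the table.
RATES = {
    "Sonnet 5 (intro)": (2.0, 10.0),
    "Sonnet 5 (standard)": (3.0, 15.0),
    "Opus 4.8": (5.0, 25.0),
    "Fable 5": (10.0, 50.0),
}

def task_cost(model, input_tokens=1_000_000, output_tokens=200_000):
    """Single-pass cost in USD, with no caching and no batch discount."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

baseline = task_cost("Opus 4.8")
for model in RATES:
    cost = task_cost(model)
    print(f"{model:<20} ${cost:>5.2f}  {cost / baseline:.0%} of Opus 4.8")
```

Swapping in your own token counts per model, rather than the equal counts assumed here, is the quickest way to see how far the ratios move.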

On raw per-token price, Sonnet 5 is unambiguously cheaper: $6.00 at standard pricing is 60% of Opus 4.8’s $10.00 for the same tokens, and $4.00 during the intro window is just 40%. But Anthropic’s updated tokenizer produces about 30% more tokens for the same text than Sonnet 4.6’s did — a change that also applies to Opus 4.7 and up, Fable 5, and the Mythos models. More tokens per unit of work quietly erodes the per-token advantage.

Read the fine print
Anthropic states the introductory pricing is calibrated so that migrating from Sonnet 4.6 is roughly cost-neutral despite the extra tokens. In other words, the $2/$10 headline is not a straight discount: it compensates for the tokenizer’s token expansion. Secondary coverage puts that expansion at 1.0 to 1.35x depending on content type; Anthropic’s headline ~30% sits near the top of that range, so treat the range as the mechanism behind the single figure, not a contradiction.
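
One way to read the cost-neutral claim is as an effective rate per unit of text rather than per token. The sketch below scales the Sonnet 5 rates by an expansion factor across the reported 1.0 to 1.35x range. The helper name is ours, and applying one factor to input and output alike is a simplifying assumption.

```python
def effective_rate(rate_per_m, expansion):
    """USD per million old-tokenizer tokens' worth of text."""
    return rate_per_m * expansion

# Sonnet 5 per-million rates (input, output) in USD.
SONNET_5 = {"intro": (2.0, 10.0), "standard": (3.0, 15.0)}

for label, (in_rate, out_rate) in SONNET_5.items():
    for expansion in (1.0, 1.30, 1.35):
        eff_in = effective_rate(in_rate, expansion)
        eff_out = effective_rate(out_rate, expansion)
        print(f"{label:<9} x{expansion:.2f}  ${eff_in:.2f} / ${eff_out:.2f}")
```

At the headline 1.30x, the intro rate works out to about $2.60/$13.00 per million tokens' worth of pre-change text, and the standard rate to about $3.90/$19.50.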

03 · Cost Inversion · Where the cheaper model becomes the pricier one.

Artificial Analysis, an independent benchmarking lab, ran the three models through its Intelligence Index at standard (non-promotional) pricing and measured cost per task rather than cost per token. On that basis, Sonnet 5 came in at $2.29 per task versus roughly $1.99 for Opus 4.8 — about 15% higher, despite Sonnet 5’s lower sticker price. The driver was token usage: per the lab, Sonnet 5 used roughly 40% more output tokens per Index task than Sonnet 4.6, and on GDPval-AA it took about three times as many conversational turns. These are Artificial Analysis’s own measurements on their methodology, not a universal cost of running Sonnet 5.

Cost per task · Artificial Analysis Intelligence Index
Source: Artificial Analysis Intelligence Index, June 30, 2026 · standard (post-intro) pricing

  • Sonnet 4.6 (prior-generation Sonnet, reference point): $1.15
  • Opus 4.8 (correctness tier, blended-effort task): $1.99
  • Sonnet 5 (lower per-token price, more tokens per task): $2.29

The crucial nuance — and the reason this is not a simple gotcha — is that the $2.29 figure is a blended-effort average at standard pricing. It does not mean Sonnet 5 is always dearer. During the intro window the per-token gap is wider, and at low and medium effort the token expansion is small enough that Sonnet 5 stays genuinely cheaper. The inversion is an effort-dependent phenomenon, not a fixed property of the model.
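
The inversion can be framed as a breakeven: how many more tokens can Sonnet 5 burn before its per-task cost passes Opus 4.8? The sketch below is our own arithmetic under two assumptions: both models use the same input/output mix, and the lab's $2.29 figure scales linearly with list rates (no caching or other discounts in the mix). The function name is illustrative.

```python
OPUS_IN, OPUS_OUT = 5.0, 25.0  # Opus 4.8 per-million rates in USD

def breakeven_token_multiplier(in_rate, out_rate):
    """Token multiplier vs Opus 4.8 at which per-task cost is equal.

    Valid only when input and output rates are the same fraction of the
    Opus 4.8 rates and both models use the same input/output mix.
    """
    ratio_in, ratio_out = in_rate / OPUS_IN, out_rate / OPUS_OUT
    if abs(ratio_in - ratio_out) > 1e-9:
        raise ValueError("input and output price ratios differ")
    return 1 / ratio_in

standard = breakeven_token_multiplier(3.0, 15.0)
intro = breakeven_token_multiplier(2.0, 10.0)
print(f"standard: Sonnet 5 is cheaper below {standard:.2f}x Opus 4.8 tokens")
print(f"intro:    Sonnet 5 is cheaper below {intro:.2f}x Opus 4.8 tokens")

# Rescale the lab's standard-pricing figure to the intro rate (2/3 of standard).
aa_sonnet5_standard, aa_opus = 2.29, 1.99
aa_sonnet5_intro = aa_sonnet5_standard * (2.0 / 3.0)
print(f"intro-rate estimate: ${aa_sonnet5_intro:.2f} vs ${aa_opus:.2f}")
```

Under these assumptions, Sonnet 5 at standard rates stays cheaper while it uses fewer than about 1.67x the tokens Opus 4.8 needs for the same task, and fewer than 2.5x during the intro window. The rescaled estimate is an illustration, not a measurement.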

Here is the part almost no launch coverage operationalizes: the same lab found max effort uses roughly six times the conversational turns of low effort on GDPval-AA. That six-fold swing dwarfs the per-token price gap between Sonnet 5 and Opus 4.8, which is about 1.7x at standard rates and 2.5x during the intro window. The practical read is that effort level is the primary cost dial and model choice is the secondary one. Most teams have it backwards, agonizing over which model to route to while leaving effort at a reflexive default.

"With Claude Sonnet 5, agents stay on plan, follow our conventions, and ship clean multi-step changes, all at an efficient cost."— Sualeh Asif, Co-founder, Cursor

04 · Benchmarks · Capability gap vs price gap, side by side.

The capability ordering is clear and consistent: on the headline agentic-coding benchmark, SWE-bench Pro, Opus 4.8 leads Sonnet 5, which in turn clears Sonnet 4.6 by a comfortable margin. Fable 5 sits highest, though its figure is Anthropic-reported rather than independently audited. The question is whether that capability gap is worth the price gap for your workload — and on many tasks it is not.

SWE-bench Pro · agentic coding accuracy
Source: Anthropic disclosures via VentureBeat & MarkTechPost; Fable 5 figure vendor-stated

  • Sonnet 4.6 (prior generation, agentic coding): 58.1%
  • Sonnet 5 (new default, +5.1 pts over 4.6): 63.2%
  • Opus 4.8 (correctness tier, +6.0 pts over Sonnet 5): 69.2%
  • Fable 5 (Anthropic-reported, vendor-stated): 80.3%

Widen the lens beyond a single benchmark and the Sonnet-5-to-Opus-4.8 gap narrows to near-noise on several axes. On agentic knowledge work, they are effectively tied.

Terminal-Bench 2.1 · Opus 4.8 leads narrowly
Opus 4.8: 82.7% · Sonnet 5: 80.4%

Opus 4.8 82.7% vs Sonnet 5 80.4%, and both leap over Sonnet 4.6's 67.0%. Sonnet 5's jump on its predecessor is the bigger story than the gap to Opus.

GDPval-AA v2 · Sonnet 5 edges Opus
Sonnet 5: 1,618 · Opus 4.8: 1,615

Sonnet 5 1,618 vs Opus 4.8 1,615 on the v2 scale, a statistical tie for agentic knowledge work. Do not compare these to older GDPval-AA numbers from earlier launches.

Humanity's Last Exam · Near-tie, with tools
Opus 4.8: 57.9% · Sonnet 5: 57.4%

Opus 4.8 57.9% vs Sonnet 5 57.4% with tools, inside the margin of noise. On raw knowledge under tool use, the two are effectively level.

The takeaway from the benchmark spread is not that one model wins. It is that Opus 4.8’s advantage is real but concentrated — largest on the hardest coding, negligible on knowledge work where Sonnet 5 sometimes edges ahead. That shape is exactly what makes a per-workload routing strategy pay off rather than a single default.

05 · Effort Matrix · The effort-level breakeven matrix.

Sonnet 5 and Opus 4.8 share the same five effort levels — low, medium, high, xhigh, and max — replacing the older manual budget_tokens parameter, which now returns a 400 error on both. That shared dial is what makes a clean crossover analysis possible. The matrix below reads left to right: as effort climbs, the cheaper-in-practice model flips from Sonnet 5 to Opus 4.8. The turn-count column is anchored to Artificial Analysis’s GDPval-AA data; the intermediate cells are qualitative because no source publishes a per-tier dollar figure.

For each Sonnet 5 / Opus 4.8 effort level, the model that is cheaper in practice, the relative conversational-turn cost versus low effort, the best-fit work, and the rationale. Turn multipliers (low baseline to about six times at max) are Artificial Analysis GDPval-AA measurements; model recommendations synthesise Artificial Analysis cost data and MarkTechPost effort-band analysis. Digital Applied framework, July 1, 2026.

| Effort | Cheaper in practice | Turns vs low | Best-fit work | Why |
|---|---|---|---|---|
| Low | Sonnet 5 | baseline | Triage, classification, routing, simple extraction | Per-token discount dominates; adaptive thinking stays light. |
| Medium | Sonnet 5 | modestly higher | Day-to-day coding, multi-step automation, structured retrieval | Independent analysis puts low and medium effort in Sonnet 5’s favor over Opus 4.8. |
| High | Benchmark it | materially higher | Harder multi-file changes, thorough review | The crossover zone: token growth starts eating the per-token gap. Measure on your own prompts. |
| xhigh | Opus 4.8 | high | Correctness-critical coding | Reported that at xhigh, Sonnet 5 cost can exceed Opus 4.8 for comparable quality. |
| Max | Opus 4.8 | up to ~6× | Hardest refactors, Dynamic Workflows | Effort, not model, is the dominant cost lever here; Opus 4.8’s max-effort quality edge justifies it. |

How to use this matrix
The honest, citable version of the whole argument comes from independent trade-press analysis of the best-value zones: low and medium effort tasks favor Sonnet 5 over Opus 4.8, while at xhigh effort Sonnet 5 costs can exceed Opus 4.8 for comparable quality. Set your effort level first, then read the row. If you are pinning everything to a single model and a single effort default, you are almost certainly overpaying on one end of the workload spectrum.
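
The "effort first, then model" reading of the matrix can be written as a small lookup. This is a sketch of our framework, not an Anthropic API: the function name and the `measured_winner` parameter are hypothetical, and the high-effort row deliberately refuses to guess.

```python
EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")

# Cheaper-in-practice model per effort level, transcribed from the matrix.
# None marks the crossover row, where the matrix says to benchmark.
CHEAPER_IN_PRACTICE = {
    "low": "Sonnet 5",
    "medium": "Sonnet 5",
    "high": None,
    "xhigh": "Opus 4.8",
    "max": "Opus 4.8",
}

def pick_model(effort, measured_winner=None):
    """Return the cheaper-in-practice model for an effort level.

    For the crossover row, the caller must supply the winner from their
    own benchmark; the matrix gives no default there.
    """
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    choice = CHEAPER_IN_PRACTICE[effort]
    if choice is None:
        if measured_winner is None:
            raise LookupError("high effort is the crossover zone: benchmark first")
        return measured_winner
    return choice

print(pick_model("medium"))
print(pick_model("high", measured_winner="Opus 4.8"))
```

Raising on the crossover row, rather than defaulting to one model, keeps the "measure on your own prompts" step from being skipped silently.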

06 · Decision Matrix · The three-model decision matrix.

Fold price, benchmarks, and effort together and the guidance collapses into four archetypes. Each maps a workload shape to a model and an effort band. Note that we deliberately leave Fable 5’s effort configuration unspecified — Anthropic has not confirmed it shares the exact five-level dial, so we position Fable 5 by task type rather than by effort tier.

High-volume agents · Triage, classification, routing
Sonnet 5 · low effort

Support-ticket triage, lead scoring, document classification, simple extraction at scale. Effort stays low, tokens stay lean, and Sonnet 5's per-token discount compounds across millions of calls.

Day-to-day coding · Everyday builds at low/medium effort
Sonnet 5 · medium effort

Feature work, refactors of moderate scope, multi-step automations. This is the sweet spot where Sonnet 5 is both capable enough and genuinely cheaper than Opus 4.8; the majority of production agent work lives here.

Correctness-critical · Max-effort production refactors
Opus 4.8 · high/max

Hundred-thousand-line refactors, migrations that must not regress, anything where a wrong answer is expensive. Opus 4.8 leads the coding benchmarks and, at high/max effort, comes in cheaper per task; pair it with Dynamic Workflows.

Long-horizon planning · Multi-day reasoning & strategy
Fable 5

Anthropic says Fable 5's lead over its other models grows with task length and complexity. Reserve the 2x price for genuinely long-horizon planning where the horizon itself is the differentiator, not routine throughput.
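
The four archetypes translate naturally into a routing table. The sketch below is one possible encoding; the archetype keys, the `Route` type, and the `route` function are hypothetical names, and Fable 5's effort is left empty because its effort configuration is unspecified.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Route:
    model: str
    efforts: Tuple[str, ...]  # empty where the matrix leaves effort unspecified

# Archetype -> recommendation, transcribed from the four cards.
ROUTES = {
    "high_volume_agents": Route("Sonnet 5", ("low",)),
    "day_to_day_coding": Route("Sonnet 5", ("medium",)),
    "correctness_critical": Route("Opus 4.8", ("high", "max")),
    "long_horizon_planning": Route("Fable 5", ()),
}

def route(archetype: str) -> Route:
    """Look up the recommended model and effort band for a workload archetype."""
    try:
        return ROUTES[archetype]
    except KeyError:
        raise ValueError(f"no route for archetype {archetype!r}") from None

for name, r in ROUTES.items():
    efforts = "/".join(r.efforts) or "effort unspecified"
    print(f"{name:<22} -> {r.model} ({efforts})")
```

Keeping the table as data rather than branching logic means a tier reshuffle becomes a one-line change.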

If your work sits at the boundary between coding and pure frontier reasoning, it is worth reading how the top tier compares across vendors — our breakdown of how Fable 5 stacks up against GPT-5.5 covers the cost trade at the frontier, where the per-task math looks different again.

07 · Watch-Outs · Comparison landmines to avoid.

A few traps recur in the coverage of these launches. Two are worth surfacing because they can quietly invalidate a comparison you build in good faith — one about benchmark scales, one about safety behavior that changes what a model will even answer.

Landmine 1 · GDPval-AA scale confusion
Opus 4.8’s GDPval-AA score appears as roughly 1,890 in its May 28 launch coverage and as 1,615 in the June 30 Sonnet 5 comparison. Those are different scales — the later figure is labelled GDPval-AA v2, an apparent rescaling. The exact rescaling methodology has not been published, so never plot the v2 numbers (Sonnet 5 1,618, Opus 4.8 1,615) on the same axis as the earlier GDPval-AA figures from the Opus 4.8 or Fable 5 launches. When a score jumps or drops across launch dates, check whether the benchmark itself was revised before you draw a conclusion.
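
A cheap guard against this landmine is to carry the scale label with every score and refuse to subtract across labels. A minimal sketch follows; the `Score` type and the "launch" label for the earlier scale are our own placeholders, since the earlier scale has no published name here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Score:
    model: str
    benchmark: str
    scale: str    # revision label, e.g. "v2"; "launch" is our placeholder
    value: float

def gap(a: Score, b: Score) -> float:
    """Difference a - b, refusing to mix benchmarks or scale revisions."""
    if (a.benchmark, a.scale) != (b.benchmark, b.scale):
        raise ValueError(
            f"cannot compare {a.benchmark} {a.scale} with {b.benchmark} {b.scale}"
        )
    return a.value - b.value

sonnet5 = Score("Sonnet 5", "GDPval-AA", "v2", 1618)
opus_v2 = Score("Opus 4.8", "GDPval-AA", "v2", 1615)
opus_launch = Score("Opus 4.8", "GDPval-AA", "launch", 1890)  # "roughly 1,890"

print(gap(sonnet5, opus_v2))  # same scale, so the gap is meaningful
try:
    gap(sonnet5, opus_launch)
except ValueError as err:
    print(err)
```
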
Landmine 2 · safety classifiers change the answer set
Sonnet 5 is the first Sonnet-tier model with real-time cybersecurity safeguards: requests touching prohibited or high-risk cyber topics may be refused, returned as an HTTP 200 with stop_reason: "refusal" rather than an error. Fable 5 goes further, shipping classifiers for cybersecurity, biology and chemistry, and distillation — when a request trips one, it is automatically answered by Opus 4.8 instead and the user is told. Anthropic reports this fallback happens in fewer than 5% of sessions, but if your evaluation set brushes those topics, you may be silently benchmarking a different model than you think.
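
In an evaluation harness, refusals and fallbacks should be labelled before anything is scored. The sketch below works on a plain dict shaped like a parsed API reply. The `stop_reason` value "refusal" comes from the behavior described above; using a `model` field to detect a fallback is an assumption of this sketch, not a documented signal, and the model ID strings in the usage lines are made up.

```python
def classify_response(response, requested_model):
    """Label an eval record so refusals and fallbacks are not scored as answers.

    `response` is a plain dict. Detecting a fallback by comparing a `model`
    field against the requested model is an assumption of this sketch.
    """
    if response.get("stop_reason") == "refusal":
        return "refused"
    if response.get("model") not in (None, requested_model):
        return "answered_by_other_model"
    return "answered"

# Hypothetical records; the model ID strings are placeholders.
records = [
    {"stop_reason": "end_turn", "model": "fable-5"},
    {"stop_reason": "refusal", "model": "fable-5"},
    {"stop_reason": "end_turn", "model": "opus-4.8"},
]
for rec in records:
    print(classify_response(rec, requested_model="fable-5"))
```

Excluding or separately reporting the non-"answered" records keeps a benchmark from quietly mixing two models' outputs.
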
"A model that knows when to say no is just as important as one that knows how to build."— Fabian Hedin, Lovable

08 · In Production · Routing all three in production.

The operational conclusion is not to standardize on one model. It is to route by workload and to treat effort as a first-class budget control. A pragmatic default stack looks like this — Sonnet 5 carries the volume, Opus 4.8 handles the hard correctness work, and Fable 5 is held in reserve for genuinely long-horizon tasks.

Default agent · Sonnet 5 · low/medium
Cheapest at scale

High-volume triage, classification, day-to-day automation and coding. The per-token discount dominates while effort stays low; this is where most calls should land.

Correctness · Opus 4.8 · high/max
~15% less at blended effort

Correctness-critical refactors and Dynamic Workflows. At high and max effort it comes in cheaper per task than Sonnet 5 for comparable quality, per Artificial Analysis's blended measurement.

Long-horizon · Fable 5 · reserve
2× Opus price

Multi-day planning where task length itself is the differentiator. Anthropic says its lead grows with complexity, so it is worth double the price only when the horizon justifies it.

Two guardrails make this stack hold up. First, batch anything latency-tolerant: the Batch API halves the rate, so Sonnet 5 intro batch runs at $1/$5 per million and Opus 4.8 batch at $2.50/$12.50 — a straight 50% cut on the numbers above. Second, lean on prompt caching for repeated context; a cache read is billed at one-tenth of base input across all three models, which can dominate the economics of any agent that re-reads a large system prompt or codebase.
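
Both guardrails are simple multipliers on the base rates, so they are easy to sanity-check. The sketch below is our arithmetic with illustrative names and a hypothetical agent turn. It leaves out any charge for writing to the cache, since no figure is given above, and it does not combine batch and cache discounts.

```python
# Base per-million rates (input, output) in USD, from the pricing table.
BASE = {
    "Sonnet 5 (intro)": (2.0, 10.0),
    "Opus 4.8": (5.0, 25.0),
    "Fable 5": (10.0, 50.0),
}

def batch_rates(model):
    """Batch API rates: half of base for both input and output."""
    in_rate, out_rate = BASE[model]
    return in_rate / 2, out_rate / 2

def cache_read_rate(model):
    """Cache reads are billed at one-tenth of the base input rate."""
    return BASE[model][0] / 10

def call_cost(model, cached_in, fresh_in, out):
    """Cost in USD of one non-batch call; cache-write charges are ignored."""
    in_rate, out_rate = BASE[model]
    total = cached_in * cache_read_rate(model) + fresh_in * in_rate + out * out_rate
    return total / 1_000_000

# Hypothetical agent turn: 200K tokens of re-read context, 2K new input, 1K output.
uncached = call_cost("Sonnet 5 (intro)", 0, 202_000, 1_000)
cached = call_cost("Sonnet 5 (intro)", 200_000, 2_000, 1_000)
print(f"uncached ${uncached:.3f}  cached ${cached:.3f}")
```

In this made-up turn the repeated context dominates the bill, which is why caching matters most for agents that re-read a large prompt or codebase on every call.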

The larger pattern to watch through the back half of 2026: Anthropic has stopped shipping a single flagship and started shipping a price-segmented ladder, with the tokenizer and effort dial doing more to determine real cost than the sticker price. Teams that build routing and effort control into their agent infrastructure now will adapt to the next tier reshuffle without re-plumbing; teams that hardcode a single model and effort default will keep overpaying every time the ladder shifts. If you want a second pair of hands on that routing logic, our agentic AI transformation engagements start with exactly this kind of per-workload model and effort eval.

09 · Conclusion · Pick per workload, not per headline.

Conclusion · Picking a Claude tier

The cheap model is only cheap until you turn up the effort.

Sonnet 5 is the right default for most agent work, and at low and medium effort it genuinely saves money against Opus 4.8. That is the honest headline, and it should not get lost in the more surprising finding that at high and max effort the per-task cost can invert. Both are true because effort, not model choice, is the dominant cost lever — a six-fold swing in conversational turns from low to max dwarfs the two-fold gap in per-token price.

So the decision is not “which model is best.” It is a two-step routine: set the effort level the task actually needs, then pick the model that is cheaper in practice at that level. High-volume, low-effort work goes to Sonnet 5; correctness-critical, max-effort work goes to Opus 4.8; genuinely long-horizon planning is where Fable 5’s premium pays for itself. Treat the $2.29-versus-$1.99 comparison as one lab’s blended-effort snapshot, not a verdict — your own workload mix is the only benchmark that settles it.

The broader signal is that Anthropic’s frontier is now a price-segmented ladder rather than a single flagship, and the levers that move real cost — token expansion and effort — sit below the sticker price where most buyers never look. The teams that win the next year of agent economics will be the ones who measure cost per finished task on their own prompts and build effort control into the stack, not the ones who chase the lowest number on the pricing page.

Route the right model to the right workload

Cost per finished task, not sticker price, is the metric that actually decides.

We help teams benchmark Claude Sonnet 5, Opus 4.8, and Fable 5 on their own workloads, build per-task cost models, and ship model-and-effort routing into production agent infrastructure — measured on cost per finished task, not sticker price.

Free consultation · Expert guidance · Tailored solutions
What we work on

Model-routing engagements

  • Cost-per-task benchmarking across Sonnet 5 / Opus 4.8 / Fable 5
  • Effort-level tuning to control agent spend
  • Per-workload routing logic with fallbacks
  • Prompt caching & Batch API cost programs
  • Governance for safety-classifier refusals & fallbacks
FAQ · Which Claude model to use

The questions we get every week.

Is Sonnet 5 cheaper than Opus 4.8?

On per-token price, yes: Sonnet 5 lists at $2/$10 per million during its introductory window (through August 31, 2026) and $3/$15 afterward, versus Opus 4.8 at $5/$25. But per finished task the picture is more nuanced. Independent lab Artificial Analysis measured Sonnet 5 at about $2.29 per task versus roughly $1.99 for Opus 4.8 on its Intelligence Index at standard pricing, about 15% higher, because Sonnet 5 emits more tokens per task. Crucially, that is a blended-effort average. At low and medium effort Sonnet 5 stays genuinely cheaper; the inversion only appears at high and max effort. So the answer depends entirely on how hard you push the effort dial.