Marketing · Framework · 12 min read · Published July 2, 2026

One banked planning session · months of cheap execution · included window closes Jul 7

Bank Your AI Plans: Fable 5 Now, Cheap Models Later

Claude Fable 5 moves from included usage to metered credits after July 7, 2026. The highest-leverage way to spend the remaining window isn’t more drafts — it’s durable plan artifacts: calendars, roadmaps, and briefs with acceptance criteria that cheaper models can execute against for months.

Digital Applied Team, senior strategists
Sources: 8 primary sources
Included window ends: Jul 7, then metered usage credits
Fable 5 output price: $50/Mtok (2x Opus 4.8 · 5x Sonnet 5 intro)
Three-tier routing session: −51% ($0.98 vs $2.02 uniform frontier, per Augment Code)
GitHub Spec Kit: 117k stars (live count, Jul 2, 2026)

Banking your AI plans — spending Claude Fable 5’s remaining included-usage window on durable planning artifacts instead of routine drafts — is the highest-leverage move a marketing or product team can make before July 7, 2026. After that date, Anthropic’s announced schedule moves Fable 5 from included plan usage to metered usage credits, and every frontier-model session becomes a line item.

The economics force a real decision. On list pricing, Fable 5 costs $10 per million input tokens and $50 per million output tokens — double Opus 4.8 and five times Sonnet 5’s introductory output rate. Teams that spend the window generating another batch of social captions will have consumed a scarce resource on work a cheaper model handles adequately. Teams that spend it producing a quarterly calendar, an SEO roadmap, or a build spec with explicit acceptance criteria walk away with an asset that keeps paying off after the meter starts.

This framework covers the deadline mechanics, the pricing gradient that makes plan-banking rational, what separates a bankable plan from a pile of prompts, the engineering patterns that already prove the model works, a plan-artifact anatomy you can copy, and — just as important — the cases where banking a plan is the wrong call.

Key takeaways
  1. July 7 is a capital-allocation deadline. Anthropic’s announced schedule includes Fable 5 for up to 50% of weekly usage limits on Pro, Max, Team, and select Enterprise plans through July 7, 2026. After that it bills as usage credits at standard API rates, with a stated $2,000/day maximum.
  2. The cost gradient rewards planning, not drafting. Fable 5 lists at $10/$50 per million tokens versus $5/$25 for Opus 4.8 and $2/$10 for Sonnet 5’s intro rate. Frontier output is 2x to 5x the price, so route it to the artifacts that steer months of work, not the work itself.
  3. A bankable plan is a spec, not a brainstorm. Spec-driven development already formalizes this: a versioned artifact with acceptance criteria, ordered tasks, and verification gates that a different, cheaper model can execute against. GitHub’s Spec Kit (117k stars as of July 2) ships the four-phase structure.
  4. Engineering has the working existence proof. The open-source architect-loop pattern has Fable 5 write gated specs that GPT-5.5 Codex builders execute in isolated lanes, with read-only test commands as acceptance gates. The same handoff contract translates directly to calendars and campaign briefs.
  5. Plan-banking is a one-time investment, not a routing religion. Community counter-examples show orchestration overhead can cost more than direct frontier calls for small tasks. Bank plans for ambiguity-heavy, long-horizon work; send routine drafts straight to a cheaper model.

01 · The Deadline · July 7 turns Fable 5 from included to metered.

The timeline matters because it compresses the decision. Fable 5 was restored globally on July 1, 2026 — across the Claude Platform, Claude.ai, Claude Code, and Claude Cowork — after the export-control restrictions imposed on June 12 were lifted on June 30. That restoration came with a clock attached.

Anthropic’s announced schedule
“For Pro, Max, Team, and select Enterprise plans, Fable 5 will be included for up to 50% of weekly usage limits through July 7, after which it will be available via usage credits.” After the cutover, credits bill at standard API rates with a stated $2,000/day maximum, separate from the subscription. Anthropic has not published a credit-to-dollar conversion beyond API pricing — treat any specific ratio you see elsewhere as unconfirmed.

Read as a resource-allocation problem, the announcement defines a window: several days of frontier-model capacity that is already paid for, followed by an indefinite period where the same capacity is billed per token. Anything Fable 5 produces inside the window that stays useful after it — a plan, a spec, a standing rule set — is effectively banked value. Anything it produces that a cheaper model could have produced is consumption. The mechanics of the credits system itself are covered in our usage-credits pricing guide; this post is about what to do with the window.

02 · The Cost Gradient · Why the pricing gap makes plans the rational spend.

The argument rests on a simple gradient. On Anthropic’s list pricing as of July 2, 2026, Fable 5 runs $10 per million input tokens and $50 per million output tokens. Opus 4.8 is $5/$25. Sonnet 5 carries an introductory $2/$10 through August 31, 2026, then moves to $3/$15. Haiku 4.5 sits at $1/$5. On output tokens — the tokens a content operation actually pays for at volume — Fable 5 is exactly 2x Opus 4.8 and 5x Sonnet 5’s introductory rate.

Output price per million tokens · Claude model tiers
Source: Anthropic list pricing (anthropic.com/claude/fable-5), retrieved July 2, 2026

  • Fable 5: $50 output · $10 input · frontier planning tier
  • Opus 4.8: $25 output · $5 input · heavy execution tier
  • Sonnet 5 (list, from Sep 1): $15 output · $3 input · announced post-intro rate
  • Sonnet 5 (intro, thru Aug 31): $10 output · $2 input · workhorse execution tier
  • Haiku 4.5: $5 output · $1 input · navigation and simple tasks
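The gradient is easy to verify by hand. The following is a minimal Python sketch that uses only the output list prices quoted above; the dictionary and function names are illustrative and do not come from any Anthropic SDK.

```python
# Output list prices in USD per million tokens, as quoted in this post.
OUTPUT_PRICE = {
    "Fable 5": 50.0,
    "Opus 4.8": 25.0,
    "Sonnet 5 (list)": 15.0,
    "Sonnet 5 (intro)": 10.0,
    "Haiku 4.5": 5.0,
}


def frontier_multiple(model: str) -> float:
    """How many times more Fable 5 output costs than `model` output."""
    return OUTPUT_PRICE["Fable 5"] / OUTPUT_PRICE[model]


for name in OUTPUT_PRICE:
    print(f"Fable 5 output costs {frontier_multiple(name):.1f}x {name}")
```

The same division gives 2x against Opus 4.8 and 5x against Sonnet 5's introductory rate, the two multiples the argument leans on.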

Two mitigations soften the post-July-7 picture without changing the conclusion. Cache reads run roughly 90% off list — about $1 per million tokens for Fable 5 cached input — and the Batch API is 50% off ($5/$25 for batched Fable 5). Both reward exactly the plan-then-execute shape: a stable plan document sitting in cache is cheap to re-read; deferred bulk execution is cheap to batch.
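To make the two discounts concrete, here is a rough sketch of the arithmetic. The discount figures are the ones quoted above; the 40,000-token plan and the 100 re-reads are hypothetical numbers chosen for illustration, and the sketch ignores any cache-write charge or cache expiry.

```python
# List prices in USD per million tokens, as quoted in this post.
FABLE_INPUT = 10.0
FABLE_OUTPUT = 50.0
CACHE_READ_DISCOUNT = 0.90  # "roughly 90% off list"
BATCH_DISCOUNT = 0.50       # Batch API discount


def plan_reread_cost(plan_tokens: int, reads: int, cached: bool) -> float:
    """USD cost of feeding the same plan document as input `reads` times."""
    rate = FABLE_INPUT * (1 - CACHE_READ_DISCOUNT) if cached else FABLE_INPUT
    return plan_tokens / 1_000_000 * rate * reads


def batched(price: float) -> float:
    """List price after the Batch API discount."""
    return price * (1 - BATCH_DISCOUNT)


# Hypothetical workload: a 40,000-token plan re-read in 100 sessions.
print(round(plan_reread_cost(40_000, 100, cached=False), 2))
print(round(plan_reread_cost(40_000, 100, cached=True), 2))
print(batched(FABLE_INPUT), batched(FABLE_OUTPUT))
```

Under those assumptions the uncached re-reads come to about $40 and the cached ones to about $4, which is why a stable plan document suits caching.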

Why pay the frontier premium for planning at all? Because the quality gap is real precisely on the hard, ambiguous work. Third-party testing puts Fable 5 at 80.3% on SWE-Bench Pro versus 58.6% for GPT-5.5 — while on OpenAI’s own Terminal-Bench 2.1 chart, Fable 5 reads at roughly 82.5–84.3 against GPT-5.6 Sol’s 88.8, so the picture is contested rather than one-sided. Users in Claude-focused communities report Fable 5 output around 10–25% better than Opus with fewer mistakes — individual anecdotes, not a controlled study, but directionally consistent with the benchmarks’ spread: the frontier model separates most on the work where being wrong is expensive. Our model decision matrix covers which model should execute the plan afterward.

The trend worth naming: cost-aware model routing has moved from optimization trick to default posture. Augment Code’s routing guide models a three-tier session — frontier for planning, Sonnet-class for implementation, Haiku-class for navigation — at $0.98 versus $2.02 for uniform frontier use, a 51% reduction, with file navigation on a frontier model running roughly 5x the cheapest tier. Community threads now argue the Fable 5 pricing gap makes explicit routing policy mandatory rather than optional. Plan-banking is the marketing translation of that shift: route the expensive model to the decisions, the cheap models to the volume.
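A routing policy of this kind can be as small as a lookup table. The sketch below is an illustration, not Augment Code's implementation: the task kinds, tier labels, and the fallback to the middle tier are all assumptions, and only the two session costs come from the guide cited above.

```python
# Hypothetical three-tier policy: task kind -> model tier.
TIER_FOR_TASK = {
    "planning": "frontier",
    "implementation": "sonnet-class",
    "navigation": "haiku-class",
}


def route(task_kind: str) -> str:
    """Pick a tier; unknown task kinds fall back to the middle tier (an assumption)."""
    return TIER_FOR_TASK.get(task_kind, "sonnet-class")


def reduction_pct(routed_cost: float, uniform_cost: float) -> float:
    """Percentage saved by routing versus using one model for everything."""
    return (uniform_cost - routed_cost) / uniform_cost * 100


print(route("navigation"))
print(round(reduction_pct(0.98, 2.02)))  # the quoted session costs
```

Plugging in $0.98 and $2.02 reproduces the 51% reduction quoted above.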

03 · Bankable Plans · A bankable plan is a spec, not a brainstorm.

The difference between a plan that banks and a plan that rots is whether a different, cheaper model can execute it without you. Engineering already has a name for this discipline: spec-driven development. The versioned, structured specification — not the code — is the source of truth; the executing agent reads the spec, breaks it into tasks, produces the work, and verifies the output against the original acceptance criteria. Nothing in that sentence requires the executing agent to be the same model that wrote the spec.

The pattern is mainstream, not fringe. GitHub’s open-source Spec Kit toolkit sat at 117k stars and 10.4k forks as measured live on July 2, 2026 — a moving figure, but the magnitude is the point — with its v0.12.3 release shipped July 1 and support for 30+ AI coding agents. Its workflow has four explicit phases, and each one maps cleanly onto a marketing plan artifact:

Phase 1 · Specify · /speckit.specify → the brief
What to build and why. Marketing equivalent: the campaign or content brief (audience, positioning, message architecture, what success looks like). This is where ambiguity gets resolved, which is exactly what the frontier model is for.
What, not how

Phase 2 · Plan the artifact · /speckit.plan → the calendar / roadmap
Architecture and constraint decisions. Marketing equivalent: the quarterly calendar or SEO roadmap (channels, cadence, keyword targets, dependencies between assets). The document a cheaper model will read every session for months.
The banked asset

Phase 3 · Tasks · /speckit.tasks → the ordered queue
A dependency-ordered task list. Marketing equivalent: per-week, per-channel production items with owners and inputs named, small enough that a Sonnet-tier model executes one at a time without needing global judgment.
Cheap-model sized

Phase 4 · Implement · /speckit.implement → production
Execute in sequence, verify against the spec. Marketing equivalent: draft, adapt, and schedule against the calendar, with the acceptance checklist from the banking session as the pass/fail gate before anything ships.
Post-July-7 work
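The "dependency-ordered task list" of Phase 3 is the piece that most directly benefits from being machine-readable. As a minimal sketch, assuming Python 3.9+ and a hypothetical `Task` record (Spec Kit itself stores tasks differently), the standard library can derive a safe execution order:

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Task:
    """One production item from the banked plan (hypothetical shape)."""
    name: str
    depends_on: list[str] = field(default_factory=list)


def execution_order(tasks: list[Task]) -> list[str]:
    """Return task names so that every task comes after its dependencies."""
    graph = {t.name: set(t.depends_on) for t in tasks}
    return list(TopologicalSorter(graph).static_order())


tasks = [
    Task("email-sequence", depends_on=["pillar-post"]),
    Task("social-cutdowns", depends_on=["pillar-post"]),
    Task("pillar-post"),
]
print(execution_order(tasks))
```

The pillar post always comes first; the relative order of the two dependent items is not significant, since neither depends on the other.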

Two more pieces complete the toolkit. Claude Code’s Plan Mode is the native mechanism for producing the artifact today: a read-only reasoning state that reads and searches source material without side effects, then returns a numbered plan naming what changes, the logic involved, and the risks — which a human approves before execution begins. And Addy Osmani’s spec-writing guidance supplies the structure: start with a concise high-level spec, have the AI expand it into a detailed plan, and shape it like a professional PRD across six areas — exact steps, testing and verification, structure, style, workflow, and explicit three-tier boundaries (always execute / ask first / never). That last area is what makes unattended execution safe.

“Vague prompts mean wrong results.” (Addy Osmani, on writing specs AI agents can execute)

The specificity is the entire trick. A vague plan forces the executing model to re-make strategic decisions every session — which is precisely the work it’s worst at and the frontier model already did. Osmani’s recommended closing step applies verbatim to marketing: after execution, compare the result with the spec and confirm all requirements are met.
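The three-tier boundaries mentioned above (always execute / ask first / never) translate naturally into a lookup the executing model's harness can consult. This is a sketch under assumptions: the example rules are invented, and defaulting unlisted actions to "ask first" is a design choice made here, not something Osmani's guidance prescribes.

```python
from enum import Enum


class Boundary(Enum):
    ALWAYS = "always execute"
    ASK_FIRST = "ask first"
    NEVER = "never"


# Hypothetical rules a banking session might write into a campaign brief.
RULES = {
    "fix a typo": Boundary.ALWAYS,
    "change a publish date": Boundary.ASK_FIRST,
    "introduce a new product claim": Boundary.NEVER,
}


def decide(action: str) -> str:
    """Map a proposed action to proceed / escalate / refuse.

    Actions the plan never mentioned default to "ask first" (an assumption).
    """
    tier = RULES.get(action, Boundary.ASK_FIRST)
    if tier is Boundary.ALWAYS:
        return "proceed"
    if tier is Boundary.NEVER:
        return "refuse"
    return "escalate to a human"


for action in ("fix a typo", "introduce a new product claim", "rename the campaign"):
    print(action, "->", decide(action))
```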

04 · Existence Proof · Frontier plans, cheap builds: already running in public.

If plan-banking sounds theoretical, the clearest counter-evidence is a working, MIT-licensed GitHub project. The architect-loop pattern (551 stars as of July 2, 2026) has Fable 5 act as architect and GPT-5.5 Codex act as builder in isolated git worktrees. The architect writes a spec scoping one-PR-slice work, split into one to four parallel lanes with declared file ownership per lane, plus acceptance gates committed as read-only test commands. Builders get one fresh execution run per lane, hold no commit access, and are instructed to “argue with the spec before building.” When a builder claims done, the architect re-runs the gates itself and reads the diff against the spec’s intent rather than trusting the builder’s pass/fail claim.

Every element has a marketing twin. Declared file ownership per lane becomes channel or asset ownership per calendar week — the social lane doesn’t touch the email sequence. Read-only test commands become a fixed pre-publish checklist the executing model can self-grade against. And “argue with the spec before building” becomes the healthiest instruction you can give a cheap drafting model: surface conflicts with the brief before producing, don’t silently improvise around them.
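Declared ownership is only useful if overlaps are caught before execution starts. The check below is a small sketch of that idea for a calendar week; the lane names and asset identifiers are hypothetical, and the same function would work for file paths in an engineering lane.

```python
from itertools import combinations


def ownership_conflicts(
    lanes: dict[str, set[str]],
) -> list[tuple[str, str, set[str]]]:
    """Return (lane_a, lane_b, shared_assets) for every pair of lanes that overlap."""
    conflicts = []
    for (a, assets_a), (b, assets_b) in combinations(lanes.items(), 2):
        shared = assets_a & assets_b
        if shared:
            conflicts.append((a, b, shared))
    return conflicts


# Hypothetical calendar week with one deliberate overlap.
week_12 = {
    "social": {"linkedin-post", "x-thread"},
    "email": {"nurture-email-3"},
    "blog": {"pillar-post", "x-thread"},
}
print(ownership_conflicts(week_12))
```

An empty list means every asset has exactly one owning lane; anything else should go back to the plan rather than be resolved at execution time.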

The verification half of the pattern is documented by Anthropic’s own engineering team. In Claude Code’s auto mode, a Stop hook can run a check as a script and block a turn from ending until the check passes — with an override after eight consecutive blocks to prevent infinite loops. The team’s general principle: give the agent a check it can run — tests, a build, a screenshot to compare. That is what turns a plan into something a cheaper, less-supervised model can execute unattended.
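The block-until-the-check-passes behaviour, with an escape hatch after eight consecutive blocks, can be simulated in plain Python. This is not Claude Code's hook configuration or API; `attempt` and `gate` are hypothetical callables standing in for "the model produces output" and "the script checks it".

```python
from typing import Callable

MAX_CONSECUTIVE_BLOCKS = 8  # override threshold quoted in the text


def run_until_gate_passes(
    attempt: Callable[[int], str],
    gate: Callable[[str], bool],
) -> tuple[str, bool, int]:
    """Retry until the gate passes or the block limit is hit.

    Returns (last_output, passed, number_of_blocks).
    """
    blocks = 0
    while True:
        output = attempt(blocks)
        if gate(output):
            return output, True, blocks
        blocks += 1
        if blocks >= MAX_CONSECUTIVE_BLOCKS:
            # Override: stop looping and hand the failure to a human.
            return output, False, blocks


# Hypothetical executor that gets it right on its third try.
result = run_until_gate_passes(
    attempt=lambda n: "final" if n >= 2 else "draft",
    gate=lambda text: text == "final",
)
print(result)
```

When the limit is reached the function reports the failure instead of looping forever, which is the point at which a human should look at the plan or the gate.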

The handoff contract
The banked artifact isn’t the calendar alone — it’s the calendar plus the acceptance criteria and verification gates that let a model you’re not supervising know, on its own, whether its output is acceptable. A plan without gates delegates the work; a plan with gates delegates the judgment. Only the second one banks.

05 · The Framework · The plan-artifact anatomy: four artifacts worth banking.

The table below is our synthesis — it translates Spec Kit’s four-phase workflow, Osmani’s PRD structure, and architect-loop’s gated-lane mechanics into the four marketing and product artifacts most worth a Fable 5 banking session this week. Existing content-calendar and SEO-template guides describe the artifact; what they skip is the handoff contract — the columns on the right — that lets a cheaper model execute it unsupervised for months.

Plan-artifact anatomy: for each of four artifact types, what Fable 5 should produce in the banking session, the acceptance criteria to embed, the ordered execution steps for a cheaper model, and the verification gate before shipping.
| Artifact | Banking-session output (Fable 5) | Acceptance criteria to embed | Execution steps for the cheap model | Verification gate before ship |
| --- | --- | --- | --- | --- |
| Quarterly content calendar | 13 weeks of topics with angle, audience segment, funnel stage, and channel/asset ownership per week (the marketing version of file ownership per lane). | Per-piece: target keyword or hook, internal links required, banned claims list, voice rules stated as standing rules. | One week at a time: draft → self-grade against criteria → adapt per channel → queue for human sign-off. | Checklist self-grade passes on every item; any criterion the draft can’t meet is escalated, never improvised around. |
| SEO roadmap | Prioritized page and cluster map with intent, dependencies between assets, and the decision log for why each target was chosen. | Per-page: title and heading constraints, entities to cover, links in/out required, what NOT to cannibalize. | Dependency order, one page per session: outline → draft → interlink pass → metadata, with each step checked before the next. | Page satisfies its own spec row and breaks no cannibalization rule; deviations logged against the decision log. |
| PRD / build spec | Osmani’s six areas: exact steps, testing and verification, structure, style, workflow, and three-tier boundaries (always / ask first / never). | Testable requirements plus the boundary tiers; the “never” list is what makes unattended execution safe. | Task-by-task in dependency order, one-PR-sized slices; builder argues with the spec before building. | Gates re-run by the reviewer, not trusted from the builder; diff read against the spec’s intent. |
| Campaign brief | Positioning, message architecture, offer logic, and the rejected alternatives with reasons, so the executor never re-litigates settled strategy. | Claim substantiation rules, compliance constraints, mandatory and banned phrases per audience and region. | Produce variants per channel from the message architecture; no new claims introduced at execution time. | Every claim traces to the brief’s substantiation list; a human approves anything the gate can’t decide. |

The pattern across all four rows: the left two columns are where Fable 5 earns its price — resolving ambiguity, making and documenting decisions. The right two columns are deliberately mechanical, which is what makes them safe to hand to Sonnet 5 or Opus 4.8. If you can’t write the verification-gate cell for an artifact, that’s the signal it isn’t ready to bank.
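A verification gate for the content-calendar row can be sketched as a checklist function that returns the list of failed criteria. The `Criteria` fields mirror the per-piece criteria named in the table; the matching logic (plain substring checks) and the sample draft are simplifying assumptions, not a production-grade checker.

```python
from dataclasses import dataclass


@dataclass
class Criteria:
    """Per-piece acceptance criteria written during the banking session."""
    target_keyword: str
    required_links: list[str]
    banned_claims: list[str]


def self_grade(draft: str, criteria: Criteria) -> list[str]:
    """Return the failed criteria; an empty list means the draft passes."""
    failures = []
    text = draft.lower()
    if criteria.target_keyword.lower() not in text:
        failures.append(f"missing keyword: {criteria.target_keyword}")
    for link in criteria.required_links:
        if link not in draft:
            failures.append(f"missing link: {link}")
    for claim in criteria.banned_claims:
        if claim.lower() in text:
            failures.append(f"banned claim present: {claim}")
    return failures


# Hypothetical criteria and draft.
criteria = Criteria(
    target_keyword="content calendar",
    required_links=["/services/content-engine", "/blog/seo-roadmap"],
    banned_claims=["guaranteed"],
)
draft = (
    "Our content calendar guide links to /services/content-engine "
    "and is guaranteed to double traffic."
)
print(self_grade(draft, criteria))
```

Any non-empty result is an escalation, in line with the table's rule that unmet criteria are never improvised around.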

06 · This Week · The banking playbook: what to produce before the meter starts.

Fable 5’s specs are built for exactly this kind of session: a 1M-token context window by default and up to 128k output tokens per request, with adaptive thinking as its only reasoning mode. In practice, a single session can ingest a large brand corpus — past performance data, voice guidelines, personas, the competitive picture — and still return a long, detailed plan document in one pass. That is the precondition for one banked session carrying months of execution.

Context window · One session, whole corpus · 1M tokens
Brand guidelines, analytics exports, past campaign postmortems, and personas fit in a single planning pass, with no summarize-then-lose-detail relay between sessions.
Default, per Anthropic model docs

Max output · Plan documents in one pass · 128k tokens
Long enough to return a full quarterly calendar with per-piece acceptance criteria, or a complete PRD with boundary tiers, without splitting the artifact across sessions.
Per request

Routing dividend · What execution costs after banking · 51%
Augment Code’s session model prices three-tier routing at $0.98 versus $2.02 for uniform frontier use. File navigation on a frontier model runs roughly 5x the cheapest tier, which is pure waste once the plan exists.
augmentcode.com routing guide
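Before a banking session it is worth checking that the corpus actually fits in one pass. The sketch below uses the 1M-token window and 128k output limit quoted above; the per-document token counts are hypothetical, and reserving the full output allowance inside the window is a conservative assumption rather than documented behaviour.

```python
CONTEXT_WINDOW = 1_000_000  # tokens, as quoted above
MAX_OUTPUT = 128_000        # tokens per request, as quoted above


def fits_one_pass(
    corpus_tokens: dict[str, int],
    reserve_output: int = MAX_OUTPUT,
) -> tuple[bool, int]:
    """Return (fits, headroom_tokens) after reserving room for the plan itself."""
    used = sum(corpus_tokens.values())
    headroom = CONTEXT_WINDOW - reserve_output - used
    return headroom >= 0, headroom


# Hypothetical corpus sizes, measured in tokens.
corpus = {
    "brand guidelines": 60_000,
    "analytics exports": 400_000,
    "campaign postmortems": 150_000,
    "personas": 20_000,
}
print(fits_one_pass(corpus))
```

A negative headroom is the signal to trim or summarize the largest inputs before the session rather than during it.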

The early-adopter record shows the shape working. In the first 72 hours after Fable 5’s original launch, one documented builder, Rich Carr, had Fable produce a multi-phase scope of work that became the running specification for later sessions. In his words, “Each correction becomes a standing rule rather than a one-time fix.” Another, Hans van Gent, built a /reflect skill that extracts durable facts, corrections, and workflows at session end and writes them into config so the next session starts smarter, letting a cheaper model execute against the standard the frontier model set once. A third had Fable solve a one-off API integration for an analytics dashboard; the resulting solution now refreshes every morning on a 24-hour task without the frontier model re-engaged. The beat-the-deadline genre is already its own content category: “Claude Fable 5 Is Finally Back: 5 Must-Try Use Cases Before July 7” was on YouTube within a day of the restoration.
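The "standing rule" idea amounts to persisting corrections somewhere the next session will read them. The following is a generic sketch of that mechanism, not the /reflect skill described above: the JSON file, its name, and the example rules are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def add_standing_rule(path: Path, rule: str) -> list[str]:
    """Append a correction to the rules file unless it is already recorded."""
    rules = json.loads(path.read_text()) if path.exists() else []
    if rule not in rules:
        rules.append(rule)
        path.write_text(json.dumps(rules, indent=2))
    return rules


# Demo in a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    rules_file = Path(tmp) / "standing_rules.json"
    add_standing_rule(rules_file, "Never promise delivery dates in ad copy.")
    add_standing_rule(rules_file, "Use sentence case in headlines.")
    rules = add_standing_rule(rules_file, "Use sentence case in headlines.")
    print(rules)
```

Recording a rule twice has no effect, so the file stays a clean list that a cheaper model can be given at the start of every session.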

Concretely, for a marketing team, the banking week looks like three or four sessions: one on a full quarterly content calendar, one on an SEO roadmap built the same way, and one or two on the campaign briefs and standing rules that govern everything else — the same plan-artifact structure our content engine service runs for client content operations. For the downstream economics — what execution costs per post at each tier once the plan exists — see the cost-per-post math across maturity tiers. One consultant reported billing around $1,200 a day on Fable 5-assisted delivery and called the subscription self-funding — one person’s anecdote, not a benchmark, but a useful sketch of what banked leverage looks like when it lands.

“Use it for the big jobs that earn it, and keep a lighter model for everything else.” (AI Blew My Mind, on sustainable Fable 5 use, June 2026)

07 · The Counter-Case · When banking a plan is the wrong call.

The honest version of this framework includes its failure modes. The most instructive counter-example comes from a developer who documented burning $21 trying to prove an AI orchestrator could beat direct Fable 5 calls — and found that for some tasks, the routing overhead cost more than it saved. The lesson isn’t that routing is wrong; it’s that plan-banking is a deliberate one-time investment in a durable artifact, not a general argument for building a permanent orchestration layer. If the task is small, self-contained, and won’t recur, the cheapest correct move is usually a single direct call to whichever model fits.
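The overhead-versus-savings trade-off reduces to a break-even count. In the sketch below every dollar figure is hypothetical; the point is only the shape of the calculation, where a fixed planning cost is recovered by a per-run saving.

```python
import math


def break_even_runs(plan_cost: float, saving_per_run: float) -> int:
    """Smallest number of executions at which cumulative savings cover the plan."""
    if saving_per_run <= 0:
        raise ValueError("routing must save something per run to ever break even")
    return math.ceil(plan_cost / saving_per_run)


def worth_banking(plan_cost: float, saving_per_run: float, expected_runs: int) -> bool:
    """True when the expected number of runs reaches the break-even point."""
    return expected_runs >= break_even_runs(plan_cost, saving_per_run)


# Hypothetical: a $5.00 planning session that saves $0.25 per executed task.
print(break_even_runs(5.00, 0.25))
print(worth_banking(5.00, 0.25, expected_runs=1))
print(worth_banking(5.00, 0.25, expected_runs=60))
```

A one-off task never reaches the break-even count, which is the arithmetic behind making a single direct call instead.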

The marketing-side framing of the same principle appeared the week this post published: “cheap and fast for drafts, frontier models for ambiguity, specialist tools for media, humans for judgement.” Banking only makes sense for the ambiguity bucket — work where the expensive decisions, once made, stay made.

Bank it · Long-horizon, ambiguity-heavy work
Quarterly calendars, SEO roadmaps, PRDs, campaign briefs: decisions that steer months of execution and reward frontier judgment once. Write the acceptance criteria and gates in the same session.
Fable 5, before July 7

Don’t bank · Routine, recurring drafts
Social captions, adaptations, resizes, summaries. A cheaper model handles these adequately with or without a banked plan; spending frontier included-usage here is consumption, not investment.
Sonnet 5 / Haiku direct

Bank shorter · Fast-moving channels
Paid social angles, trend-reactive formats, anything where the strategic picture shifts monthly. Bank a one-month horizon with an explicit review date instead of a quarter, because a stale plan executed faithfully is worse than no plan.
Short-horizon artifact

Skip the layer · One-off, self-contained tasks
The $21 orchestrator lesson: when a task won’t recur, routing and plan overhead can exceed the frontier premium itself. Make the single direct call and move on.
Direct call, no plan
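The four cases above can be read as a short decision procedure. This sketch encodes one plausible ordering of the questions; the three boolean inputs are a simplification introduced here, and real work will not always classify so cleanly.

```python
def banking_call(one_off: bool, ambiguity_heavy: bool, fast_moving: bool) -> str:
    """Classify a piece of work into one of the four cases above."""
    if one_off:
        return "skip the layer"  # single direct call, no plan
    if not ambiguity_heavy:
        return "don't bank"      # routine drafts go straight to a cheaper model
    if fast_moving:
        return "bank shorter"    # one-month horizon with a review date
    return "bank it"             # long-horizon plan with gates


print(banking_call(one_off=False, ambiguity_heavy=True, fast_moving=False))
print(banking_call(one_off=False, ambiguity_heavy=True, fast_moving=True))
```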
One operational footnote
Fable 5’s relaunch safety classifier blocks the Amazon-reported jailbreak technique at better than 99%, but produces more false positives on routine coding and work tasks — and flagged prompts fall back to Opus 4.8 (in under 5% of sessions, per the pre-incident figure). A banking session that trips the classifier mid-task could quietly downgrade models. Worth checking which model actually produced your plan artifact before you treat it as banked — not a reason to skip the session.
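Checking provenance can be a one-line guard, assuming the response metadata you store alongside each artifact records the serving model in a `model` field. The dictionary shape and the identifiers `fable-5` and `opus-4-8` below are hypothetical placeholders, not confirmed model IDs.

```python
def produced_by(response: dict, expected_model_prefix: str) -> bool:
    """True when the recorded serving model starts with the expected prefix."""
    return str(response.get("model", "")).startswith(expected_model_prefix)


# Hypothetical stored metadata for two plan artifacts.
calendar_meta = {"artifact": "q3-calendar", "model": "fable-5"}
roadmap_meta = {"artifact": "seo-roadmap", "model": "opus-4-8"}

for meta in (calendar_meta, roadmap_meta):
    status = "banked" if produced_by(meta, "fable-5") else "re-check before banking"
    print(meta["artifact"], "->", status)
```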

08 · Conclusion · Spend the window on decisions, not drafts.

The capital-allocation read

A banked plan keeps paying off after the meter starts.

July 7 is best understood as a one-time capital-allocation decision. The included window is a scarce, expiring resource; routine drafts spend it on work a cheaper model would have done adequately anyway, while plan artifacts convert it into something durable — decisions with gates that Opus 4.8 and Sonnet 5 can execute against for months at a half to a fifth of the output price.

The pattern requires no faith. Spec-driven development, GitHub’s Spec Kit, Claude Code’s Plan Mode, and the architect-loop architect/builder split are all shipped, public, and working — marketing teams are borrowing a proven structure, not betting on a new one. The only genuinely new work is translation: channel ownership instead of file ownership, pre-publish checklists instead of test commands, briefs instead of PRDs.

Looking forward, the July 7 cutover is unlikely to be the last time a frontier model moves from included to metered — the economics of frontier capacity point the same direction across every provider. Teams that learn to bank plans now are building the muscle that matters for the next window too: knowing exactly which work earns the expensive model, and having the artifact structure ready so that everything else doesn’t need it.

Plan once, execute for months

Bank the strategy while it’s included — then let cheaper models do the running.

Our content engine runs exactly this pattern — senior-led planning artifacts with acceptance criteria, executed by cost-routed AI pipelines with verification gates before anything ships.

What we work on

Plan-artifact engagements

  • Quarterly content calendars with acceptance criteria
  • SEO roadmaps built as executable specs
  • Campaign briefs with claim-substantiation gates
  • Cost-routed model pipelines — frontier plans, cheap execution
  • Verification checklists your team can self-grade against
FAQ · Plan banking

Plan-banking questions, answered.

Per Anthropic’s announced schedule, Fable 5 is included for up to 50% of weekly usage limits on Pro, Max, Team, and select Enterprise plans through July 7, 2026. After that date it becomes available via usage credits — billed at standard API rates, separate from the subscription, with a stated $2,000/day maximum and the usual auto-reload threshold and monthly spend cap controls. Anthropic has not published any credit-to-dollar conversion beyond standard API pricing, so treat specific ratios circulating elsewhere as unconfirmed. Practically: frontier-model work is effectively pre-paid through July 7 and pay-per-token after, which is what makes the remaining window a genuine allocation decision rather than business as usual.