Banking your AI plans — spending Claude Fable 5’s remaining included-usage window on durable planning artifacts instead of routine drafts — is the highest-leverage move a marketing or product team can make before July 7, 2026. After that date, Anthropic’s announced schedule moves Fable 5 from included plan usage to metered usage credits, and every frontier-model session becomes a line item.
The economics force a real decision. On list pricing, Fable 5 costs $10 per million input tokens and $50 per million output tokens — double Opus 4.8 and five times Sonnet 5’s introductory output rate. Teams that spend the window generating another batch of social captions will have consumed a scarce resource on work a cheaper model handles adequately. Teams that spend it producing a quarterly calendar, an SEO roadmap, or a build spec with explicit acceptance criteria walk away with an asset that keeps paying off after the meter starts.
This framework covers the deadline mechanics, the pricing gradient that makes plan-banking rational, what separates a bankable plan from a pile of prompts, the engineering patterns that already prove the model works, a plan-artifact anatomy you can copy, and — just as important — the cases where banking a plan is the wrong call.
- 01 · July 7 is a capital-allocation deadline. Anthropic’s announced schedule includes Fable 5 for up to 50% of weekly usage limits on Pro, Max, Team, and select Enterprise plans through July 7, 2026. After that it bills as usage credits at standard API rates, with a stated $2,000/day maximum.
- 02 · The cost gradient rewards planning, not drafting. Fable 5 lists at $10/$50 per million tokens versus $5/$25 for Opus 4.8 and $2/$10 for Sonnet 5’s intro rate. Frontier output is 2x to 5x the price, so route it to the artifacts that steer months of work, not the work itself.
- 03 · A bankable plan is a spec, not a brainstorm. Spec-driven development already formalizes this: a versioned artifact with acceptance criteria, ordered tasks, and verification gates that a different, cheaper model can execute against. GitHub’s Spec Kit (117k stars as of July 2) ships the four-phase structure.
- 04 · Engineering has the working existence proof. The open-source architect-loop pattern has Fable 5 write gated specs that GPT-5.5 Codex builders execute in isolated lanes, with read-only test commands as acceptance gates. The same handoff contract translates directly to calendars and campaign briefs.
- 05 · Plan-banking is a one-time investment, not a routing religion. Community counter-examples show orchestration overhead can cost more than direct frontier calls for small tasks. Bank plans for ambiguity-heavy, long-horizon work; send routine drafts straight to a cheaper model.
01 · The Deadline
July 7 turns Fable 5 from included to metered.
The timeline matters because it compresses the decision. Fable 5 was restored globally on July 1, 2026 — across the Claude Platform, Claude.ai, Claude Code, and Claude Cowork — after the export-control restrictions imposed on June 12 were lifted on June 30. That restoration came with a clock attached.
Read as a resource-allocation problem, the announcement defines a window: several days of frontier-model capacity that is already paid for, followed by an indefinite period where the same capacity is billed per token. Anything Fable 5 produces inside the window that stays useful after it — a plan, a spec, a standing rule set — is effectively banked value. Anything it produces that a cheaper model could have produced is consumption. The mechanics of the credits system itself are covered in our usage-credits pricing guide; this post is about what to do with the window.
02 · The Cost Gradient
Why the pricing gap makes plans the rational spend.
The argument rests on a simple gradient. On Anthropic’s list pricing as of July 2, 2026, Fable 5 runs $10 per million input tokens and $50 per million output tokens. Opus 4.8 is $5/$25. Sonnet 5 carries an introductory $2/$10 through August 31, 2026, then moves to $3/$15. Haiku 4.5 sits at $1/$5. On output tokens — the tokens a content operation actually pays for at volume — Fable 5 is exactly 2x Opus 4.8 and 5x Sonnet 5’s introductory rate.
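To make the gradient concrete, here is a minimal sketch that prices one session at each tier's list rate. The dictionary keys are hypothetical labels (not API model identifiers), and the 200k-in / 40k-out workload is an assumed example rather than a measurement.

```python
# List prices in USD per million tokens as (input, output), as quoted above.
# Keys are illustrative labels only, not real API model identifiers.
PRICES = {
    "fable-5": (10.0, 50.0),
    "opus-4.8": (5.0, 25.0),
    "sonnet-5-intro": (2.0, 10.0),
    "haiku-4.5": (1.0, 5.0),
}


def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD for one session."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Assumed workload: 200k tokens of source material in, 40k tokens of plan out.
for model in PRICES:
    print(f"{model:>15}: ${session_cost(model, 200_000, 40_000):.2f}")
```

On this assumed workload the same session costs $4.00 on Fable 5 and $0.80 on Sonnet 5's introductory rate, which is the 5x gap stated above.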
[Chart: Output price per million tokens · Claude model tiers]
Source: Anthropic list pricing (anthropic.com/claude/fable-5), retrieved July 2, 2026
Two mitigations soften the post-July-7 picture without changing the conclusion. Cache reads run roughly 90% off list (about $1 per million tokens for Fable 5 cached input), and the Batch API is 50% off ($5/$25 for batched Fable 5). Both reward exactly the plan-then-execute shape: a stable plan document sitting in cache is cheap to re-read; deferred bulk execution is cheap to batch.
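A rough sketch of how the two discounts play out, under simplifying assumptions: cached input is billed at exactly 10% of list, batch at exactly 50%, and cache-write charges and cache expiry are ignored. The function names and the 50k-token plan size are illustrative, not taken from any pricing documentation.

```python
FABLE5_INPUT = 10.0   # USD per million input tokens, list price
FABLE5_OUTPUT = 50.0  # USD per million output tokens, list price
CACHE_READ_FACTOR = 0.10  # assumption: cached reads billed at 10% of list
BATCH_FACTOR = 0.50       # Batch API billed at 50% of list


def reread_cost(plan_tokens: int, sessions: int, cached: bool) -> float:
    """Input cost of re-reading the same plan document across many sessions."""
    rate = FABLE5_INPUT * (CACHE_READ_FACTOR if cached else 1.0)
    return plan_tokens * sessions * rate / 1_000_000


def batched_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a deferred job at the batch discount."""
    list_cost = (input_tokens * FABLE5_INPUT + output_tokens * FABLE5_OUTPUT) / 1_000_000
    return list_cost * BATCH_FACTOR


# Assumed example: a 50k-token plan re-read in 100 execution sessions.
print(f"uncached re-reads: ${reread_cost(50_000, 100, cached=False):.2f}")
print(f"cached re-reads:   ${reread_cost(50_000, 100, cached=True):.2f}")
print(f"batched 1M in/out: ${batched_cost(1_000_000, 1_000_000):.2f}")
```

The same arithmetic applies to whichever model does the re-reading; only the two list-price constants change.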
Why pay the frontier premium for planning at all? Because the quality gap is real precisely on the hard, ambiguous work. Third-party testing puts Fable 5 at 80.3% on SWE-Bench Pro versus 58.6% for GPT-5.5 — while on OpenAI’s own Terminal-Bench 2.1 chart, Fable 5 reads at roughly 82.5–84.3 against GPT-5.6 Sol’s 88.8, so the picture is contested rather than one-sided. Users in Claude-focused communities report Fable 5 output around 10–25% better than Opus with fewer mistakes — individual anecdotes, not a controlled study, but directionally consistent with the benchmarks’ spread: the frontier model separates most on the work where being wrong is expensive. Our model decision matrix covers which model should execute the plan afterward.
The trend worth naming: cost-aware model routing has moved from optimization trick to default posture. Augment Code’s routing guide models a three-tier session — frontier for planning, Sonnet-class for implementation, Haiku-class for navigation — at $0.98 versus $2.02 for uniform frontier use, a 51% reduction, with file navigation on a frontier model running roughly 5x the cheapest tier. Community threads now argue the Fable 5 pricing gap makes explicit routing policy mandatory rather than optional. Plan-banking is the marketing translation of that shift: route the expensive model to the decisions, the cheap models to the volume.
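An explicit routing policy can be as small as a lookup table. The sketch below assumes three task kinds matching the three-tier split described above; the tier names and the choice to reject unlisted task kinds are illustrative assumptions, not part of Augment Code's guide. It also re-derives the quoted saving from the two quoted session costs.

```python
from enum import Enum


class Tier(Enum):
    FRONTIER = "frontier"
    MID = "sonnet-class"
    CHEAP = "haiku-class"


# One rule per task kind, following the three-tier split described above.
ROUTING_POLICY = {
    "planning": Tier.FRONTIER,
    "implementation": Tier.MID,
    "navigation": Tier.CHEAP,
}


def route(task_kind: str) -> Tier:
    """Return the tier for a task kind; unlisted kinds are rejected, not guessed."""
    try:
        return ROUTING_POLICY[task_kind]
    except KeyError:
        raise ValueError(f"no routing rule for task kind: {task_kind!r}") from None


# Re-derive the quoted reduction from the two quoted session costs.
ROUTED_COST = 0.98
UNIFORM_FRONTIER_COST = 2.02
reduction = (UNIFORM_FRONTIER_COST - ROUTED_COST) / UNIFORM_FRONTIER_COST

print(route("planning").value, f"{reduction:.0%}")
```

Rejecting unknown task kinds is a deliberate choice in this sketch: a policy that silently defaults to the frontier tier hides exactly the spend it was written to control.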
03 · Bankable Plans
A bankable plan is a spec, not a brainstorm.
The difference between a plan that banks and a plan that rots is whether a different, cheaper model can execute it without you. Engineering already has a name for this discipline: spec-driven development. The versioned, structured specification — not the code — is the source of truth; the executing agent reads the spec, breaks it into tasks, produces the work, and verifies the output against the original acceptance criteria. Nothing in that sentence requires the executing agent to be the same model that wrote the spec.
The pattern is mainstream, not fringe. GitHub’s open-source Spec Kit toolkit sat at 117k stars and 10.4k forks as measured live on July 2, 2026 — a moving figure, but the magnitude is the point — with its v0.12.3 release shipped July 1 and support for 30+ AI coding agents. Its workflow has four explicit phases, and each one maps cleanly onto a marketing plan artifact:
Specify
What to build and why. Marketing equivalent: the campaign or content brief — audience, positioning, message architecture, what success looks like. This is where ambiguity gets resolved, which is exactly what the frontier model is for.
Plan
Architecture and constraint decisions. Marketing equivalent: the quarterly calendar or SEO roadmap — channels, cadence, keyword targets, dependencies between assets. The document a cheaper model will read every session for months.
Tasks
A dependency-ordered task list. Marketing equivalent: per-week, per-channel production items with owners and inputs named — small enough that a Sonnet-tier model executes one at a time without needing global judgment.
Implement
Execute in sequence, verify against the spec. Marketing equivalent: draft, adapt, and schedule against the calendar — with the acceptance checklist from the banking session as the pass/fail gate before anything ships.
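The four phases above can be sketched as a single data structure that an executing model reads. This is an illustrative shape, not Spec Kit's actual file format: the class names, field names, and the three example tasks are all hypothetical. The dependency ordering uses the standard-library `graphlib` module (Python 3.9+).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Task:
    task_id: str
    description: str
    depends_on: list[str] = field(default_factory=list)


@dataclass
class BankedPlan:
    brief: str                      # Specify: what and why
    constraints: list[str]          # Plan: channels, cadence, dependencies
    tasks: list[Task]               # Tasks: small, ordered production items
    acceptance_criteria: list[str]  # Implement: pass/fail gate before shipping

    def execution_order(self) -> list[str]:
        """Return task ids so that every task comes after its dependencies."""
        graph = {task.task_id: set(task.depends_on) for task in self.tasks}
        return list(TopologicalSorter(graph).static_order())


plan = BankedPlan(
    brief="Q3 launch content for an example product",
    constraints=["one pillar post per week", "email follows the pillar post"],
    tasks=[
        Task("social-cuts", "Adapt newsletter into social posts", ["newsletter"]),
        Task("newsletter", "Summarize pillar post for email", ["pillar-post"]),
        Task("pillar-post", "Draft the weekly pillar post"),
    ],
    acceptance_criteria=["target keyword present", "no banned claims"],
)

print(plan.execution_order())
```

Because the order is derived from declared dependencies rather than list position, the cheaper model can be handed one task at a time without needing to reason about the whole calendar.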
Two more pieces complete the toolkit. Claude Code’s Plan Mode is the native mechanism for producing the artifact today: a read-only reasoning state that reads and searches source material without side effects, then returns a numbered plan naming what changes, the logic involved, and the risks — which a human approves before execution begins. And Addy Osmani’s spec-writing guidance supplies the structure: start with a concise high-level spec, have the AI expand it into a detailed plan, and shape it like a professional PRD across six areas — exact steps, testing and verification, structure, style, workflow, and explicit three-tier boundaries (always execute / ask first / never). That last area is what makes unattended execution safe.
“Vague prompts mean wrong results.”
Addy Osmani, on writing specs AI agents can execute
The specificity is the entire trick. A vague plan forces the executing model to re-make strategic decisions every session — which is precisely the work it’s worst at and the frontier model already did. Osmani’s recommended closing step applies verbatim to marketing: after execution, compare the result with the spec and confirm all requirements are met.
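The three-tier boundaries can be sketched as a lookup the executing model consults before acting. The specific rules below are hypothetical examples for a marketing context, and sending unlisted actions to a human is an assumption of this sketch rather than something the guidance prescribes.

```python
# Hypothetical boundary rules for a marketing execution agent.
BOUNDARIES = {
    "always": {"run the pre-publish checklist", "cite the brief for every claim"},
    "ask_first": {"change the publishing cadence", "add a new channel"},
    "never": {"introduce a new product claim", "edit the banked plan"},
}


def decide(action: str) -> str:
    """Map an action to 'proceed', 'ask', or 'refuse'."""
    if action in BOUNDARIES["never"]:
        return "refuse"
    if action in BOUNDARIES["always"]:
        return "proceed"
    # Listed "ask first" actions and anything unlisted both go to a human.
    return "ask"


for action in ("run the pre-publish checklist", "add a new channel",
               "edit the banked plan", "translate the post into German"):
    print(f"{action}: {decide(action)}")
```

Checking the "never" tier first means a rule listed in two tiers by mistake resolves to the safer outcome.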
04 · Existence Proof
Frontier plans, cheap builds: already running in public.
If plan-banking sounds theoretical, the clearest counter-evidence is a working, MIT-licensed GitHub project. The architect-loop pattern (551 stars as of July 2, 2026) has Fable 5 act as architect and GPT-5.5 Codex act as builder in isolated git worktrees. The architect writes a spec scoping one-PR-slice work, split into one to four parallel lanes with declared file ownership per lane, plus acceptance gates committed as read-only test commands. Builders get one fresh execution run per lane, hold no commit access, and are instructed to “argue with the spec before building.” When a builder claims done, the architect re-runs the gates itself and reads the diff against the spec’s intent rather than trusting the builder’s pass/fail claim.
Every element has a marketing twin. Declared file ownership per lane becomes channel or asset ownership per calendar week — the social lane doesn’t touch the email sequence. Read-only test commands become a fixed pre-publish checklist the executing model can self-grade against. And “argue with the spec before building” becomes the healthiest instruction you can give a cheap drafting model: surface conflicts with the brief before producing, don’t silently improvise around them.
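The ownership rule translates into two mechanical checks: no asset is claimed by two lanes, and a lane's output touches only what it owns. A minimal sketch, with hypothetical lane and asset names:

```python
# Hypothetical lanes for one calendar week, each with declared asset ownership.
LANES = {
    "social": {"linkedin-posts", "short-video-scripts"},
    "email": {"nurture-sequence", "newsletter"},
}


def shared_assets(lanes: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return every asset claimed by more than one lane, with the claiming lanes."""
    owners: dict[str, list[str]] = {}
    for lane, assets in lanes.items():
        for asset in assets:
            owners.setdefault(asset, []).append(lane)
    return {asset: sorted(names) for asset, names in owners.items() if len(names) > 1}


def out_of_lane(lane: str, touched: set[str], lanes: dict[str, set[str]]) -> set[str]:
    """Return the assets a lane touched that it does not own."""
    return set(touched) - lanes[lane]


print(shared_assets(LANES))
print(out_of_lane("social", {"linkedin-posts", "newsletter"}, LANES))
```

An empty result from both checks is the marketing analogue of a clean lane: the social output can ship without anyone re-reading the email sequence.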
The verification half of the pattern is documented by Anthropic’s own engineering team. In Claude Code’s auto mode, a Stop hook can run a check as a script and block a turn from ending until the check passes — with an override after eight consecutive blocks to prevent infinite loops. The team’s general principle: give the agent a check it can run — tests, a build, a screenshot to compare. That is what turns a plan into something a cheaper, less-supervised model can execute unattended.
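The block-until-pass behavior with an escape hatch can be illustrated with a plain loop. This is a generic simulation of the idea, not Claude Code's hook API: the function names, the revision counter, and the stand-in checklist are all hypothetical.

```python
from typing import Callable

MAX_CONSECUTIVE_BLOCKS = 8  # after this many failed checks, stop blocking


def run_until_gate_passes(
    attempt: Callable[[], None],
    gate: Callable[[], bool],
) -> tuple[bool, int]:
    """Repeat attempt() until gate() passes or the block limit is reached.

    Returns (passed, blocks), where blocks counts consecutive failed checks.
    """
    blocks = 0
    while blocks < MAX_CONSECUTIVE_BLOCKS:
        attempt()
        if gate():
            return True, blocks
        blocks += 1
    return False, blocks


# Stand-in for a drafting model: each attempt is one revision.
state = {"revisions": 0}


def revise() -> None:
    state["revisions"] += 1


def checklist_passes() -> bool:
    # Stand-in gate: pretend the draft meets the checklist on the third revision.
    return state["revisions"] >= 3


passed, blocks = run_until_gate_passes(revise, checklist_passes)
print(passed, blocks)
```

When the limit is hit the loop returns `False` rather than raising, which leaves the escalation decision (for example, routing to a human) to the caller.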
05 · The Framework
The plan-artifact anatomy: four artifacts worth banking.
The table below is our synthesis — it translates Spec Kit’s four-phase workflow, Osmani’s PRD structure, and architect-loop’s gated-lane mechanics into the four marketing and product artifacts most worth a Fable 5 banking session this week. Existing content-calendar and SEO-template guides describe the artifact; what they skip is the handoff contract — the columns on the right — that lets a cheaper model execute it unsupervised for months.
| Artifact | Banking-session output (Fable 5) | Acceptance criteria to embed | Execution steps for the cheap model | Verification gate before ship |
|---|---|---|---|---|
| Quarterly content calendar | 13 weeks of topics with angle, audience segment, funnel stage, and channel/asset ownership per week — the marketing version of file ownership per lane. | Per-piece: target keyword or hook, internal links required, banned claims list, voice rules stated as standing rules. | One week at a time: draft → self-grade against criteria → adapt per channel → queue for human sign-off. | Checklist self-grade passes on every item; any criterion the draft can’t meet is escalated, never improvised around. |
| SEO roadmap | Prioritized page and cluster map with intent, dependencies between assets, and the decision log for why each target was chosen. | Per-page: title and heading constraints, entities to cover, links in/out required, what NOT to cannibalize. | Dependency order, one page per session: outline → draft → interlink pass → metadata — each step checked before the next. | Page satisfies its own spec row and breaks no cannibalization rule; deviations logged against the decision log. |
| PRD / build spec | Osmani’s six areas: exact steps, testing and verification, structure, style, workflow, and three-tier boundaries (always / ask first / never). | Testable requirements plus the boundary tiers — the “never” list is what makes unattended execution safe. | Task-by-task in dependency order, one-PR-sized slices; builder argues with the spec before building. | Gates re-run by the reviewer, not trusted from the builder; diff read against the spec’s intent. |
| Campaign brief | Positioning, message architecture, offer logic, and the rejected alternatives with reasons — so the executor never re-litigates settled strategy. | Claim substantiation rules, compliance constraints, mandatory and banned phrases per audience and region. | Produce variants per channel from the message architecture; no new claims introduced at execution time. | Every claim traces to the brief’s substantiation list; a human approves anything the gate can’t decide. |
The pattern across all four rows: the left two columns are where Fable 5 earns its price — resolving ambiguity, making and documenting decisions. The right two columns are deliberately mechanical, which is what makes them safe to hand to Sonnet 5 or Opus 4.8. If you can’t write the verification-gate cell for an artifact, that’s the signal it isn’t ready to bank.
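A verification gate of the kind in the right-hand column can be a few mechanical checks. The sketch below assumes a content-calendar row with a target keyword, required internal links, and a banned-claims list; the class, the example values, and the plain substring matching are illustrative simplifications (a production gate would need sturdier matching).

```python
from dataclasses import dataclass


@dataclass
class Criteria:
    target_keyword: str
    required_links: list[str]
    banned_claims: list[str]


def self_grade(draft: str, criteria: Criteria) -> list[str]:
    """Return the failed checks; an empty list means the gate passes."""
    text = draft.lower()
    failures = []
    if criteria.target_keyword.lower() not in text:
        failures.append(f"missing target keyword: {criteria.target_keyword}")
    for link in criteria.required_links:
        if link not in draft:
            failures.append(f"missing internal link: {link}")
    for claim in criteria.banned_claims:
        if claim.lower() in text:
            failures.append(f"banned claim present: {claim}")
    return failures


# Hypothetical criteria and draft.
criteria = Criteria(
    target_keyword="content calendar",
    required_links=["/blog/seo-roadmap"],
    banned_claims=["guaranteed results"],
)
draft = "Our content calendar template delivers guaranteed results."
failures = self_grade(draft, criteria)
print("PASS" if not failures else f"ESCALATE: {failures}")
```

Returning the list of failures, rather than a bare boolean, gives the escalation path something specific to hand to a human.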
06 · This Week
The banking playbook: what to produce before the meter starts.
Fable 5’s specs are built for exactly this kind of session: a 1M-token context window by default and up to 128k output tokens per request, with adaptive thinking as its only reasoning mode. In practice, a single session can ingest a large brand corpus — past performance data, voice guidelines, personas, the competitive picture — and still return a long, detailed plan document in one pass. That is the precondition for one banked session carrying months of execution.
One session, whole corpus
Brand guidelines, analytics exports, past campaign postmortems, and personas fit in a single planning pass — no summarize-then-lose-detail relay between sessions.
Plan documents in one pass
Long enough to return a full quarterly calendar with per-piece acceptance criteria, or a complete PRD with boundary tiers, without splitting the artifact across sessions.
What execution costs after banking
Augment Code’s session model prices three-tier routing at $0.98 versus $2.02 for uniform frontier use. File navigation on a frontier model runs roughly 5x the cheapest tier — pure waste once the plan exists.
The early-adopter record shows the shape working. In the first 72 hours after Fable 5’s original launch, one documented builder, Rich Carr, had Fable produce a multi-phase scope of work that became the running specification for later sessions. In his words: “Each correction becomes a standing rule rather than a one-time fix.” Another, Hans van Gent, built a /reflect skill that extracts durable facts, corrections, and workflows at session end and writes them into config so the next session starts smarter, letting a cheaper model execute against the standard the frontier model set once. A third had Fable solve a one-off API integration for an analytics dashboard; the resulting solution now refreshes every morning on a 24-hour task without re-engaging the frontier model. The beat-the-deadline genre is already its own content category: “Claude Fable 5 Is Finally Back: 5 Must-Try Use Cases Before July 7” was on YouTube within a day of the restoration.
Concretely, for a marketing team, the banking week looks like three or four sessions: one on a full quarterly content calendar, one on an SEO roadmap built the same way, and one or two on the campaign briefs and standing rules that govern everything else — the same plan-artifact structure our content engine service runs for client content operations. For the downstream economics — what execution costs per post at each tier once the plan exists — see the cost-per-post math across maturity tiers. One consultant reported billing around $1,200 a day on Fable 5-assisted delivery and called the subscription self-funding — one person’s anecdote, not a benchmark, but a useful sketch of what banked leverage looks like when it lands.
“Use it for the big jobs that earn it, and keep a lighter model for everything else.”
AI Blew My Mind, on sustainable Fable 5 use, June 2026
07 · The Counter-Case
When banking a plan is the wrong call.
The honest version of this framework includes its failure modes. The most instructive counter-example comes from a developer who documented burning $21 trying to prove an AI orchestrator could beat direct Fable 5 calls — and found that for some tasks, the routing overhead cost more than it saved. The lesson isn’t that routing is wrong; it’s that plan-banking is a deliberate one-time investment in a durable artifact, not a general argument for building a permanent orchestration layer. If the task is small, self-contained, and won’t recur, the cheapest correct move is usually a single direct call to whichever model fits.
The marketing-side framing of the same principle appeared the week this post published: “cheap and fast for drafts, frontier models for ambiguity, specialist tools for media, humans for judgement.” Banking only makes sense for the ambiguity bucket — work where the expensive decisions, once made, stay made.
Long-horizon, ambiguity-heavy work
Quarterly calendars, SEO roadmaps, PRDs, campaign briefs — decisions that steer months of execution and reward frontier judgment once. Write the acceptance criteria and gates in the same session.
Routine, recurring drafts
Social captions, adaptations, resizes, summaries. A cheaper model handles these adequately with or without a banked plan — spending frontier included-usage here is consumption, not investment.
Fast-moving channels
Paid social angles, trend-reactive formats, anything where the strategic picture shifts monthly. Bank a one-month horizon with an explicit review date instead of a quarter — a stale plan executed faithfully is worse than no plan.
One-off, self-contained tasks
The $21 orchestrator lesson: when a task won’t recur, routing and plan overhead can exceed the frontier premium itself. Make the single direct call and move on.
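The four cases above reduce to a break-even question: does the saving on downstream tasks exceed what the plan cost to produce? The sketch below is a deliberately crude, cost-only model with made-up dollar figures; it ignores quality differences and plan staleness, and the function name and parameters are illustrative.

```python
def should_bank(
    plan_cost: float,
    frontier_cost_per_task: float,
    cheap_cost_per_task: float,
    expected_tasks: int,
) -> bool:
    """True when downstream savings exceed the one-time cost of the plan."""
    saving_per_task = frontier_cost_per_task - cheap_cost_per_task
    return expected_tasks * saving_per_task > plan_cost


# Assumed figures: a $4.00 planning session, $0.50 vs $0.10 per executed task.
print(should_bank(4.00, 0.50, 0.10, expected_tasks=1))   # one-off task
print(should_bank(4.00, 0.50, 0.10, expected_tasks=52))  # a quarter of weekly items
```

With these assumed numbers the plan pays for itself after ten executed tasks; a one-off task never gets there, which is the direct-call case.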
08 · Conclusion
Spend the window on decisions, not drafts.
A banked plan keeps paying off after the meter starts.
July 7 is best understood as a one-time capital-allocation decision. The included window is a scarce, expiring resource; routine drafts spend it on work a cheaper model would have done adequately anyway, while plan artifacts convert it into something durable — decisions with gates that Opus 4.8 and Sonnet 5 can execute against for months at a half to a fifth of the output price.
The pattern requires no faith. Spec-driven development, GitHub’s Spec Kit, Claude Code’s Plan Mode, and the architect-loop architect/builder split are all shipped, public, and working — marketing teams are borrowing a proven structure, not betting on a new one. The only genuinely new work is translation: channel ownership instead of file ownership, pre-publish checklists instead of test commands, briefs instead of PRDs.
Looking forward, the July 7 cutover is unlikely to be the last time a frontier model moves from included to metered — the economics of frontier capacity point the same direction across every provider. Teams that learn to bank plans now are building the muscle that matters for the next window too: knowing exactly which work earns the expensive model, and having the artifact structure ready so that everything else doesn’t need it.