Running Claude Fable 5 as the planner brain — the model that plans and judges while cheaper models execute — is the single highest leverage config change you can make in Hermes Agent or OpenClaw this month. Fable 5 was restored for all customers on July 1, 2026 after its June 12 export-control suspension, and Anthropic has announced that included plan usage ends July 7, with metered usage credits at standard API rates from July 8.

At $10 per million input tokens and $50 per million output — twice Opus 4.8’s $5/$25 — leaving Fable 5 as the default model for every tool call in a 24/7 agent is the expensive way to run it. The cheap way is architectural: Fable 5 touches only the plan and the review pass, and the execution loop runs on Opus 4.8 or something cheaper still. Both major open-source agent runtimes have a documented config surface for exactly this split.

This playbook covers the two frameworks’ planner/executor architectures, the exact keys in Hermes Agent’s config.yaml and OpenClaw’s openclaw.json, a side-by-side config table nobody else has published, the worked cost math, and the security hardening that has to come before you hand a frontier planner shell access.

Key takeaways

01
Run Fable 5 as planner and judge only.Fable 5 lists at $10/$50 per million tokens — twice Opus 4.8’s $5/$25. Route execution to cheaper models and the expensive model’s token footprint per task shrinks while the agent keeps running unattended.
02
Each framework has one config surface for the split.Hermes Agent: the primary model plus fallback_provider in ~/.hermes/config.yaml. OpenClaw: agents.defaults.subagents.model in ~/.openclaw/openclaw.json, with per-agent and per-call overrides. Both are documented by the vendors.
03
The worked math: $4.50 vs $2.25 per task.A 200K-input / 50K-output task costs ≈$4.50 on Fable 5 and ≈$2.25 on Opus 4.8 at list rates. Prompt caching takes 90% off cached input, dropping the input share of that task toward ~$0.20 on cache hits.
04
Two fallbacks exist — keep them straight.Anthropic’s safeguard classifier reroutes sensitive queries to Opus 4.8 automatically, outside your control, on under 5% of sessions. Your config-level fallback_provider or fallbacks[] is a separate, user-controlled mechanism for outages and cost.
05
Harden the skill marketplace before granting autonomy.An early-2026 audit of ClawHub found 341 of 2,857 published skills malicious (~12%), and CVE-2026-25253 (CVSS 8.8) was a one-click RCE against OpenClaw. Vet skills and isolate the runtime before the planner gets shell access.

01 — Why NowThe planner-brain pattern, and the July 8 reason to care.

Fable 5 shipped on June 9, 2026 as Anthropic’s flagship for long-horizon agentic work — planning across stages, delegating to sub-agents, running for hours while validating its own output. That job description is precisely why it belongs at the top of an agent stack rather than inside the execution loop: the skills that justify its price are planning, delegation, and judgment, not the ten thousand routine tool calls in between.

The economics turned urgent this week. After the June 12 export-control suspension, Fable 5 came back for all customers on July 1 — selectable again in Claude, the Anthropic API, and partner platforms including AWS Bedrock, Google Cloud Vertex AI, and Microsoft Foundry — with pricing unchanged at $10/$50 per million tokens. Anthropic’s announced schedule gives subscribers an included window through July 7 (up to 50% of weekly plan limits), after which usage meters as credits at standard API rates. The plan-by-plan detail is in our Fable 5 usage-credits pricing guide. From July 8, every token your agent burns on Fable 5 is a token you pay list price for.

The planner-brain pattern is the standing answer. Fable 5 reads the task, writes the plan, and reviews the result; a cheaper model — Opus 4.8 at half the rate, or something smaller — does the fetching, editing, testing, and retrying. The agent’s capability ceiling stays where the frontier model sets it, while the metered bill tracks the cheap model’s rate for the bulk of the tokens.

The announced schedule

Fable 5 inclusion in paid plans runs through July 7, 2026. Metered usage credits at standard API rates begin July 8, per Anthropic’s redeployment announcements. How credits convert beyond “standard API rates” has not been published in detail — treat any specific conversion figure you see elsewhere as unconfirmed. What is confirmed: $10/$50 per million tokens, unchanged through the restoration.

02 — ArchitecturesTwo frameworks, two shapes of planner and executor.

Hermes Agent and OpenClaw are the two open-source runtimes where this pattern is most commonly wired up, and they model “planner plus executor” differently — which is why the config differs. MindStudio’s comparison frames it as agents built for modularity and composability versus agents built for simplicity and quick deployment. Hermes runs multi-agent orchestration: orchestrator agents spawn or call specialist agents and pass structured results between them. OpenClaw runs a single-agent plan-execute-reflect loop: it generates a task plan, executes steps, and reviews its own results before proceeding.

Multi-agent orchestration

Hermes Agent

Python · MIT license · NousResearch

Orchestrator agents spawn specialist agents and pass structured results between them. The planner split is session-level: set the primary model, and use fallback_provider for cost or outage routing. Reached #1 across AI applications on OpenRouter’s global rankings on May 10, 2026.

github.com/nousresearch/hermes-agent

Plan-execute-reflect loop

OpenClaw

TypeScript · created Nov 24, 2025

A single agent generates a task plan, executes steps, and reviews its own results before proceeding. The planner split is explicit config: model.primary for the brain, agents.defaults.subagents.model for the hands.

github.com/openclaw/openclaw

Both are large, actively maintained projects, not fringe tools — our July 2 GitHub snapshot showed roughly 207,600 stars on hermes-agent and 381,400 on openclaw (counts move daily; treat them as a snapshot, not a fixed claim). Momentum stories abound: NetworkChuck, a YouTuber with around five million subscribers, announced in May that after a month of use he was moving all of his OpenClaw agents to Hermes. Neither project is an Anthropic product, and Anthropic endorses neither — both are third-party runtimes that support the Anthropic API as one of several providers.

Which shape you want depends on the work. If your tasks decompose into specialist roles — a researcher, a coder, a reviewer — the Hermes orchestration model maps naturally. If your tasks are linear jobs with self-review, OpenClaw’s loop is simpler to reason about, and its subagents.model key gives you the cleanest literal expression of “the model that plans” versus “the model that executes” in any config file we’ve seen.

03 — Hermes SetupHermes: config.yaml routing and a memory that compounds.

Hermes Agent installs with a single command — curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash — which provisions uv, Python 3.11, Node.js, ripgrep, and ffmpeg, and stores all state under ~/.hermes/. Hermes works with any OpenAI-compatible endpoint and treats Anthropic as a first-class provider, so wiring Fable 5 in as the brain is a config edit, not a plugin hunt: set the primary model to claude-fable-5 in ~/.hermes/config.yaml.

The cost-aware routing lever is the fallback_provider key in the same file. It serves two jobs: resilience — a local Ollama or vLLM model as an offline fallback if the Anthropic API is unreachable — and cost control, for example Fable 5 primary with Opus 4.8 (claude-opus-4-8) as the fallback. Note what Hermes does not document: a per-subagent model key. The routing surface is session-level, which fits its orchestration architecture — you swap the whole session’s model rather than splitting one session across two models. If you’re coming from OpenClaw, hermes claw migrate imports settings, memories, skills, and API keys in one command.

What makes the planner investment compound is Hermes’s persistence. It keeps a four-layer memory: a curated MEMORY.md for environment facts and a USER.md for preferences, both loaded into the system prompt at session start, plus a SQLite archive with full-text search and a skills directory. And after a complex task — roughly five or more tool calls — Hermes writes a reusable SKILL.md document compatible with the agentskills.io open standard, so the plan Fable 5 derived once doesn’t get re-derived (and re-billed) next time. Automation entry points: /schedule for cron-style jobs, and hermes gateway setup then hermes gateway install for Telegram, Discord, or Slack. For the full walkthrough beyond planner routing, see our complete Hermes Agent setup guide and the latest Hermes desktop release notes.

Persistent memory

MEMORY.md → SQLite

4layers

MEMORY.md environment facts and USER.md preferences load into the system prompt at session start; a SQLite archive with full-text search and a skills directory sit underneath. All state lives in ~/.hermes/.

Loaded every session

Self-improvement

Tasks become SKILL.md

5+calls

After a complex task of roughly five or more tool calls, Hermes writes a reusable SKILL.md compatible with the agentskills.io standard — the planner’s approach is captured instead of re-derived.

agentskills.io standard

Traction

OpenRouter, May 10, 2026

Hermes hit #1 across all AI applications on OpenRouter’s global rankings roughly 90 days after its February 2026 launch. Star counts (≈207,600 at our July 2 snapshot) move daily.

≈90 days post-launch

"Fable 5 supplies the reasoning. Hermes supplies the loop, memory, tools, and persistence. The combination is a self-improving agent that holds context across days of work and runs on the strongest publicly available coding model."— Lushbinary editorial team, Claude Fable 5 + Hermes Agent Setup Guide

04 — OpenClaw SetupOpenClaw: subagents.model is the whole trick.

OpenClaw’s config lives at ~/.openclaw/openclaw.json (or ~/.clawdbot/clawdbot.json for installs from the older Clawdbot npm package), edited via openclaw config edit or at the path openclaw doctor prints. The planner/executor split is a first-class config concept: model.primary is the model that plans, and agents.defaults.subagents.model is the global default for the sub-agents that execute. Two override levels sit on top — per-agent at agents.list[].subagents.model, and per-call via the sessions_spawn model parameter — so a single deployment can run Fable 5 planning with Opus 4.8 execution as the default, and still pin one specific agent or one specific spawn to a different model.

Two Fable-5-specific behaviors from OpenClaw’s own Anthropic provider docs are worth knowing before you wire it in. First, OpenClaw omits custom temperature values on Fable 5 requests. Second, Fable 5 always uses adaptive thinking and defaults to high effort — /think off and /think minimal are remapped to low effort rather than disabling thinking, because Anthropic doesn’t allow thinking to be fully disabled on this model. Budget for reasoning tokens accordingly: the planner will think, whether or not you asked it to.

Cost routing extends past sub-agents. A heartbeat.model key routes OpenClaw’s periodic are-you-still-there checks — default every 30 minutes — to a cheap model, and a fallbacks[] array handles outage failover. One sourcing caveat we should be transparent about: the heartbeat and fallbacks key pattern comes from VelvetShark’s multi-model routing guide, published February 2026 — before Fable 5 existed — so take the key shapes from it, not the model names or prices, and substitute claude-fable-5, claude-opus-4-8, and current rates. New to the framework entirely? Start with our full OpenClaw setup and skills walkthrough.

Config hygiene

The most common mistake in shipped OpenClaw configs is setting only model.primary and assuming sub-agents inherit something sensible. They inherit the primary — which means every executor tool call bills at Fable 5’s $10/$50 rate until agents.defaults.subagents.model says otherwise. Set the split explicitly, then verify with a spot-check of per-model token usage after the first day of running.

05 — The Reference TableThe planner-brain config, side by side.

No single source puts the two frameworks’ planner/executor syntax in one place with Fable 5 as the specific model being routed — which is exactly what you need when you’re deciding where to run the pattern. The table below compiles the Hermes column from the Lushbinary setup guide and the OpenClaw column from OpenClaw’s official provider and sub-agent docs, with the heartbeat/fallbacks key pattern from VelvetShark’s routing guide, all retrieved July 2, 2026.

Planner-brain configuration side by side — for each setting, the Hermes Agent syntax and the OpenClaw syntax for running Claude Fable 5 as the planner with cheaper models executing. Compiled from the Lushbinary setup guide, OpenClaw’s official docs, and VelvetShark’s routing guide, retrieved July 2, 2026.
Setting	Hermes Agent	OpenClaw
Model routing
Config file	`~/.hermes/config.yaml`	`~/.openclaw/openclaw.json` (legacy installs: `~/.clawdbot/clawdbot.json`); edit via `openclaw config edit`
Fable 5 as planner	Set the primary model to `claude-fable-5` — Hermes treats Anthropic as a first-class provider	Point `model.primary` at `claude-fable-5` in the Anthropic provider block
Cheaper model for execution	No per-subagent model key is documented — routing is session-level, via the primary model plus `fallback_provider`	`agents.defaults.subagents.model` globally; per-agent via `agents.list[].subagents.model`; per-call via the `sessions_spawn` `model` parameter
Resilience and operations
Fallback on outage or cost	`fallback_provider` in `config.yaml` — e.g. Opus 4.8, or a local Ollama/vLLM model if the Anthropic API is unreachable	`fallbacks[]` array (config-key pattern; substitute current model IDs)
Background pings	No heartbeat concept documented in the sources we reviewed	`heartbeat.model` routes the periodic check (default every 30 minutes) to a cheap model
Scheduling and channels	`/schedule` for cron-style jobs; `hermes gateway setup` then `hermes gateway install` for Telegram/Discord/Slack	Not covered in the sources this guide draws on
Migration path	`hermes claw migrate` imports OpenClaw settings, memories, skills, and API keys in one command	n/a — OpenClaw is the migration source here
Security surface
Skills ecosystem	Self-written `SKILL.md` documents, compatible with the agentskills.io open standard	ClawHub marketplace — an early-2026 audit found 341 of 2,857 published skills malicious (~12%); vet before installing

Read the third row twice — it’s the architectural fork. OpenClaw gives you a literal key that separates the planning model from the executing model inside one session. Hermes doesn’t, by design: its answer to “cheaper execution” is orchestration — spawn specialist agents and route between providers at the session boundary — plus fallback_provider for the cost and outage cases. If your primary goal is the tightest possible Fable-5-plans / cheap-model-executes split with minimal moving parts, OpenClaw’s config expresses it more directly. If you want the planner’s work to persist and compound across days, Hermes’s memory and skills layers are the stronger foundation.

06 — Cost MathWhat the split actually saves.

Anchor the decision in one worked example, at each model’s list price. A single agentic task consuming 200,000 input tokens and 50,000 output tokens costs about $4.50 on Fable 5 and about $2.25 on Opus 4.8 — the identical task, run twice, at $10/$50 versus $5/$25 per million tokens. Every derived cell below recomputes from those rates.

Worked cost example for one agentic task of 200,000 input and 50,000 output tokens — line items for Claude Fable 5 at $10/$50 per million tokens versus Claude Opus 4.8 at $5/$25, plus the prompt-caching variant at 90% off cached input. Rates from Anthropic’s published pricing via the Lushbinary setup guide, retrieved July 2, 2026.
Line item	Fable 5 ($10 / $50)	Opus 4.8 ($5 / $25)	How it’s computed
Input · 200K tokens	$2.00	$1.00	0.2M × $10/M vs 0.2M × $5/M
Output · 50K tokens	$2.50	$1.25	0.05M × $50/M vs 0.05M × $25/M
Task total, list price	$4.50	$2.25	Opus 4.8 runs the identical task at half the cost
Input on a full cache hit	≈$0.20	—	90% off cached input: 0.2M × $1/M (shown for Fable 5, the metered model)
Task total with cached input	≈$2.70	—	≈$0.20 cached input + $2.50 output

One agentic task · three ways to pay for it

Source: Anthropic published rates via Lushbinary setup guide, July 2026 — recomputed per line

Fable 5, everything200K in / 50K out · list price

$4.50

Fable 5 with cached input90% off cached input tokens

≈$2.70

Opus 4.8, same taskhalf the list rate on both sides

$2.25

Two levers stack on top of the routing split. Prompt caching is the first: Anthropic’s discount is 90% off cached input tokens, so in a long session that reuses a large system prompt or codebase, the input portion of that $4.50 task drops toward ~$0.20 on cache hits — $1 per million cached-input tokens against the $10 list rate. The general technique is covered in our prompt-caching engineering guide. The second lever is the pattern itself: in a planner-brain setup, Fable 5's share of tokens is the plan and the review pass, not the whole loop, so the $10/$50 rate applies to a small slice of the task while the executor’s cheaper rate covers the bulk. How much smaller that slice is depends entirely on your workload — we deliberately won’t invent a universal percentage.

Our forward read: as metering starts July 8, expect planner/executor splits to move from enthusiast trick to default posture in agent deployments, the way multi-tier model routing already did in production API stacks through 2025. The frameworks have made the config trivial; the remaining work is measurement — per-model token accounting per task, so you can see the split paying for itself. This is the kind of cost-and-architecture decision our AI transformation engagements exist to pressure-test before it hits your invoice.

07 — Routing GotchaTwo fallbacks, not one — don’t conflate them.

There are two distinct mechanisms that can hand your query to a different model, and debugging routing gets miserable if you mix them up. The first is yours: the config-level fallback_provider (Hermes) or fallbacks[] (OpenClaw) — user-controlled, for outages and cost, and it does exactly what you configured. The second is Anthropic’s: a safeguard classifier, applied server-side since the July 1 restoration, that reroutes a query to Opus 4.8 automatically when it lands in cybersecurity, biology, chemistry, or model-distillation territory. As the Lushbinary guide puts it: “For the vast majority of coding, automation, and research work the safeguard fallback never fires. If your agent operates near security or life-sciences topics, expect some responses to come from Opus 4.8, and budget for the fact that you may be paying the Fable 5 rate while receiving an Opus 4.8 answer on those specific turns.”

The classifier is automatic, outside your control, and fires on under 5% of sessions per Anthropic’s relaunch documentation. For most planner-brain deployments it’s a rounding error; for agents that touch security tooling — a pentest triage bot, a dependency CVE analyst — it’s a real line item and a real behavior change to test for. One more restoration footnote for compliance-minded teams: the mandatory 30-day data-retention requirement still applies to Fable 5 traffic — no zero-data-retention exemption for Mythos-class models — and our 30-day retention explainer covers what that means for enterprise agreements. Only the ZDR carve-out changed; this is not “everyone’s data is now retained.”

Keep them straight

Config fallback: you choose when it fires (outage, budget), and logs show the model you configured. Safeguard classifier: Anthropic chooses, per-query, in sensitive domains — and per the Lushbinary guide you may pay the Fable 5 rate on turns where an Opus 4.8 answer comes back. If your agent’s outputs suddenly read differently on security-adjacent tasks, check which mechanism moved before you touch your config.

08 — SecurityHarden before you grant autonomy.

Here’s the connection the setup guides don’t make: wiring Fable 5 in as the planner means giving a frontier model shell access and letting it run unattended — on a runtime whose skill ecosystem was, months ago, measurably compromised. In early 2026 an independent audit of all 2,857 skills published on ClawHub, OpenClaw’s marketplace, found 341 confirmed malicious — roughly 12% — with about 335 traced to a single coordinated campaign tracked as ClawHavoc. Separately, CVE-2026-25253, rated CVSS 8.8, was a one-click remote-code-execution chain exploitable even against localhost-bound OpenClaw instances (patched in v2026.1.29), and scanning teams including Censys, Bitsight, and Hunt.io identified 30,000+ internet-exposed OpenClaw instances, many with no authentication.

The vendor response was real: on March 27, 2026, OpenClawd (the managed-hosting vendor) shipped automated skill vetting — static analysis plus behavioral testing before a skill activates — verified installer sourcing, and runtime sandboxing that auto-blocks skills flagged for network exfiltration, prompt injection, or credential exposure. Treat that as the floor, not the ceiling. Before your planner gets autonomy: run the latest patched version, don’t expose the instance to the internet, vet every skill as if the audit numbers were current, and isolate the runtime from your real credentials. The step-by-step version is our OpenClaw hardening guide — read it before the first unattended run, not after. To be fair to both sides of the table: these incidents are OpenClaw/ClawHub-specific; Hermes has its own agentskills.io skill ecosystem, which was not documented as compromised in the sources we reviewed.

"By default the agent runs as your user with access to your home directory, SSH keys, and any cloud credentials on the box. Prompt injection from a fetched web page or a file in the repo can turn a benign task into `rm -rf` or a key exfiltration attempt. Isolation is not optional for an autonomous agent."— Lushbinary editorial team, Claude Fable 5 + Hermes Agent Setup Guide

Researcher corroboration

Unit 42, Palo Alto Networks’ threat-research arm, documented the ClawHub incident wave as an emerging AI supply-chain threat — malicious skills as the new malicious packages. The uncomfortable multiplier in a planner-brain setup is capability: the stronger the model you hand to a compromised skill, the more competently a hijacked session pursues the attacker’s goal. Marketplace hygiene isn’t adjacent to this playbook; it’s a prerequisite for it.

09 — ConclusionOne config key between you and a halved agent bill.

The planner-brain posture, July 2026

Put the expensive model where judgment lives, and meter everything else.

The pattern is small enough to ship this afternoon. In Hermes: primary model claude-fable-5, cost and outage routing via fallback_provider, and let the memory and skills layers turn each expensive plan into a reusable asset. In OpenClaw: model.primary for the brain, agents.defaults.subagents.model for the hands, and heartbeat.model so background pings never touch the metered model.

The math holds at any scale that matters: the same task at $4.50 on Fable 5 or $2.25 on Opus 4.8, with caching pulling the cached input share down by 90% — and the planner split shrinking the expensive model’s footprint to the plan and the review pass. With included usage ending July 7 and metering starting July 8, the teams that set this split now will barely notice the cliff; everyone else finds out from an invoice.

And the order of operations is non-negotiable: harden first, autonomy second. A marketplace that measured ~12% malicious, a CVSS 8.8 one-click RCE, and 30,000+ exposed instances are not reasons to skip the pattern — they’re reasons to treat isolation and skill vetting as step zero of it. The planner brain is only as trustworthy as the body you give it.

Fable 5 as the Planner Brain in Hermes and OpenClaw

01 — Why NowThe planner-brain pattern, and the July 8 reason to care.

02 — ArchitecturesTwo frameworks, two shapes of planner and executor.

Hermes Agent

OpenClaw

03 — Hermes SetupHermes: config.yaml routing and a memory that compounds.

MEMORY.md → SQLite

Tasks become SKILL.md

OpenRouter, May 10, 2026

04 — OpenClaw SetupOpenClaw: subagents.model is the whole trick.

05 — The Reference TableThe planner-brain config, side by side.

06 — Cost MathWhat the split actually saves.

One agentic task · three ways to pay for it

07 — Routing GotchaTwo fallbacks, not one — don’t conflate them.

08 — SecurityHarden before you grant autonomy.

09 — ConclusionOne config key between you and a halved agent bill.

Put the expensive model where judgment lives, and meter everything else.

The strongest planner is only worth it when cheaper models do the running.

Agent-stack engagements

The questions we get every week.

Continue exploring agent stacks.

Fable 5 Before July 7: The Six-Day Window Playbook

10 Fable 5 Prompts to Upgrade Your AI Agent Setup 2026

Fable 5 Cost Engineering: Cache, Batch and Spend Caps

Autonomous AI Agents 2026: From OpenClaw to MoltBook

AI Agent Memory 2026: Vector, Graph, Episodic Update

AI Agent Governance: Policy and Compliance 2026 Guide