AI Development · Tutorial · Published May 2, 2026

Build a working code-reviewer subagent from .claude/agents/ markdown to first invocation — with a real-world example you can paste into your repo today.

Build a Claude Code Custom Subagent: Step-by-Step Guide

A walkthrough that produces a real, copy-pasteable subagent definition for a senior code-reviewer agent. Covers frontmatter, tool allowlist, system prompt structure, automatic delegation, and orchestration patterns — built from the ground up so your team can ship its first subagent today.

Digital Applied Team
AI engineering · Published May 2, 2026
Read time 17 min · Sources 9
  • Default model: Sonnet (balanced cost / capability)
  • Parallel agents max: 10 concurrent invocations
  • Token cost vs sequential: 5-10× parallel multiplier
  • Lines per agent file: 30-80 typical range

A Claude Code custom subagent is the smallest unit of leverage in an agentic development workflow — a markdown file in .claude/agents/ that defines a focused expert with its own system prompt, tool allowlist, and invocation contract. Build it well and the main session becomes a thin orchestrator that delegates the messy work; build it badly and you get hallucinated scope, runaway tool use, and review loops that never converge.

This guide walks through the seven frontmatter fields that shape behavior, the four-section system prompt pattern that produces reliable subagents, the tool-allowlist discipline that keeps blast radius small, and the orchestration patterns — sequential, parallel, role-based, hierarchical — that compose subagents into real workflows. Everything below is grounded in the public Claude Code docs and the patterns we ship for client repos.

The worked example at the end is a real code-reviewer subagent — frontmatter, system prompt, and a session transcript you can paste into a repo and have running inside two minutes. If you take one thing away, take that: subagents are markdown, and the markdown is the whole interface.

Key takeaways
  1. Subagents are markdown — that's the whole interface. Place a .md file in .claude/agents/ (project) or ~/.claude/agents/ (user). YAML frontmatter for config; the markdown body is the system prompt. No build step, no SDK, no registry.
  2. The description field decides whether you get auto-invoked. Keep it specific and keyword-rich; 'MUST BE USED for…' patterns lift priority. Vague descriptions get ignored — Claude Code picks the agent whose description best matches the user's intent.
  3. Least-privilege tools prevent expensive mistakes. Code reviewers get Read + Grep, not Bash. Writers get Read + Write + Edit. Reserve Task for orchestrators only — it's the source of delegation loops and runaway token spend.
  4. Subagents have no memory between invocations. Each invocation starts fresh — no chat history, no scratchpad. Embed prior state in the prompt, write checkpoints to disk, or use the main session as the state holder.
  5. Parallel is up to 10× faster but costs 5-10× the tokens. Use parallel for time-critical reviews and exploration, sequential for cost-sensitive batches. Hierarchical orchestrators amplify both axes — plan the spend before you spawn ten reviewers.

01 · Anatomy: What's inside a Claude Code subagent.

A Claude Code subagent is a single markdown file. There is no registry, no compiled artifact, no SDK call to register it. Drop the file into the right directory and Claude Code discovers it the next time you start a session — that's the entire onboarding ceremony.

The file has two halves. The top is YAML frontmatter between two `---` fences — that's where you declare the subagent's name, what it's for, which tools it can call, and which model backs it. Everything below the second `---` is the markdown body, which becomes the subagent's system prompt verbatim. No preprocessing, no templating — what you write is what the agent sees.

Subagent files live in one of two locations, and the precedence order matters when you have both:

  • Project scope: .claude/agents/<name>.md in the repo root. Committed to git. Shared across the team. Wins ties when the same agent name exists in both scopes.
  • User scope: ~/.claude/agents/<name>.md in your home directory. Personal — never committed. Useful for the agents you want available in every repo (a general code-explainer, a commit-message helper) without polluting any single project.
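Bootstrapping one takes two commands. A minimal sketch; the commit-helper name and body below are placeholder examples, not part of any standard kit:

```shell
# Create the project-scope agents directory and drop in a one-file agent.
mkdir -p .claude/agents

cat > .claude/agents/commit-helper.md <<'EOF'
---
name: commit-helper
description: MUST BE USED for drafting commit messages from the staged diff.
tools: Read, Bash
---

You are a commit-message helper. Run `git diff --staged`, then propose
one conventional-commit message. Do not modify any files.
EOF
```

Restart the Claude Code session afterwards so the new file is discovered.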

Once Claude Code has discovered a subagent, the main session can either delegate automatically based on the agent's description field, or you can invoke it explicitly with /agents or by naming the agent in a prompt. The description field is the difference between a subagent that gets used and one that sits idle — Section 05 covers exactly how to write it.

The mental model
A subagent is a fresh Claude session with a pre-baked system prompt and a restricted toolset. It has no memory of past invocations and no access to the main session's chat history. Each call is a clean slate — which is what makes subagents predictable and what makes their input-prompt design so important.

02 · Frontmatter: The seven fields that shape behavior.

The YAML frontmatter is small but consequential. Only two fields are strictly required (name and description), but the others are how you tune blast radius, model cost, and discoverability. Here's a complete frontmatter block from a production code-reviewer:

---
name: code-reviewer
description: MUST BE USED for code review, PR review, security audit, and refactor suggestions on TypeScript / Next.js / Python code. Read-only — never writes files.
tools: Read, Grep, Glob, Bash
model: sonnet
context: project
autoMemoryEnabled: false
---

Each field controls a different lever. The grid below summarizes what each one does and the most common values teams pick:

  • name (Required · unique per scope): lowercase-hyphenated. Identifier used in /agents and in explicit invocation. Must match the filename (minus .md). Common names: code-reviewer, test-writer, security-auditor, refactor-architect, doc-writer.
  • description (Required · the most important field): 1-3 sentences, keyword-rich. Drives automatic delegation. Lead with 'MUST BE USED for X' to lift priority. Name the languages, frameworks, and tasks the agent covers. Vague descriptions don't get auto-invoked.
  • tools (Recommended · least-privilege gate): comma-separated allowlist. Restrict what the agent can call. Reviewers: Read, Grep, Glob. Writers: Read, Write, Edit, Glob. Researchers: WebFetch, WebSearch, Read. Omit to inherit ALL main-session tools (rarely desirable).
  • model (Recommended · cost & capability lever): sonnet · opus · haiku · inherit. Backing model for this agent. Sonnet is the default — balanced. Opus for hard reasoning (architecture review, complex refactors). Haiku for fast classification (triage, label detection). Inherit copies the main session's model.
  • context (Optional · self-documenting): project · user. Explicit scope marker. Mostly informational — the file's directory already determines scope. Useful for documentation and for tooling that audits agent inventories across a fleet of repos.
  • autoMemoryEnabled (Optional · memory write-gate): true · false. Whether the agent appends to the project's CLAUDE.md / memory. Default false for review agents (read-only). True only for agents whose job is to learn and write durable conventions — usually orchestrators, not workers.
  • skills, permissions (Optional · fine-grained controls): comma-separated skill names. Tighter coupling to project-installed skills. permissions can grant or deny specific bash patterns ('Bash(git push:*)' denied) without touching the broader tools allowlist. Use sparingly — they're easy to forget.

Two of these earn the loudest reactions in code review: description and tools. Get the description wrong and the agent never runs — Claude Code picks a different agent or handles the request in the main session. Get the tools wrong and you either over-grant (the reviewer suddenly writes files) or under-grant (the writer can't Read the files it needs to edit). Sections 04 and 05 drill into each one.
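For contrast with the full block above, the smallest valid frontmatter carries only the two required fields. A sketch (the doc-writer name is a placeholder); note that omitting tools means the agent inherits every main-session tool, which Section 04 argues against:

```yaml
---
name: doc-writer
description: MUST BE USED for writing and updating markdown documentation under docs/ in this repo.
---
```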

03 · System Prompt: Role, context discovery, output format.

The markdown body below the frontmatter is the system prompt — the instructions the agent reads at the start of every invocation. Because each invocation is a clean slate, this prompt is doing more work than a system prompt in a chat app: it has to establish identity, teach context discovery, prescribe the workflow, and pin the output format every single time.

The pattern we use across client repos has four anchors. Skip any one of them and the agent drifts. Hit all four and the agent behaves identically on invocation 1 and invocation 100.

  • Anchor 1 · Role (1-2 sentences): Who is this agent? Open with 'You are a [role] specializing in [domain]'. Be specific — 'You are a senior code-reviewer specializing in TypeScript and Next.js' beats 'You are a code reviewer'. The role anchors voice, depth, and what gets flagged.
  • Anchor 2 · Context (discovery checklist): Where does it look? Tell the agent where ground truth lives in this repo — CLAUDE.md, /docs, /lib, /app, the latest changed files in git. 'Before reviewing, read CLAUDE.md and the latest diff via git diff main...HEAD' gets you 80% of the way.
  • Anchor 3 · Workflow (numbered sequence): How does it work? Numbered steps. 'Step 1: Read CLAUDE.md. Step 2: Run git diff. Step 3: For each changed file, check [list].' Numbered workflows survive the agent's lack of memory because they're re-read on every invocation.
  • Anchor 4 · Output (pinned format): What does it return? Pin the output shape exactly. 'Return a markdown summary with three sections: Critical, Warnings, Suggestions. Each item references a file and line. End with a one-sentence verdict.' Output discipline is the single biggest fix for hallucinated scope.

A tight system prompt for a code-reviewer is typically 30 to 60 lines of markdown. Longer than that and you're probably duplicating what should be in CLAUDE.md (project-level conventions belong there, not in every agent). Shorter than that and the agent either over-interprets the prompt or asks the main session for clarification — which defeats the point of delegation.
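Compressed into a template, the four anchors look like this. It is a sketch to fill in, not a finished prompt; every bracketed placeholder is yours to replace:

```markdown
# Role
You are a [role] specializing in [domain]. You [scope of work]. You do not [boundary].

# Context discovery
1. Read CLAUDE.md at the repo root.
2. Run `git diff main...HEAD` to identify the changes in scope.
3. [Repo-specific sources of truth: docs/, key modules.]

# Workflow
For every [unit of work]:
1. [First check.]
2. [Second check.]

# Output format
Return a markdown document with exactly these sections: [A], [B], [Verdict].
```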

One pattern worth borrowing from the MCP world: end the system prompt with an explicit example of a good invocation and a good output. A one-shot example burned into the system prompt is a cheap way to anchor the output format without using a full few-shot demonstration in the user prompt. For deeper context on how vocabulary choices in tool definitions affect tool-use reliability, our MCP tool-use vocabulary reference covers the same idea applied to MCP servers — the lessons port directly to subagent system prompts.

04 · Tool Allowlist: Least-privilege by design.

The tools field is a security and cost gate, not a convenience knob. The default reaction is to leave it off — that gives the agent every tool the main session has — but that's usually wrong. Reviewers shouldn't write. Researchers shouldn't run bash. Orchestrators shouldn't edit code. The blast radius of a misbehaving agent is exactly the union of the tools it can call.

The standard mapping we use across client repos:

  • Read-only review (code-reviewer, security-auditor): Read, Grep, Glob, Bash (read-only commands only). No Write, no Edit. Reviewers can describe the change they would make; they cannot make it. This is the safest and most-deployed pattern.
  • Build & edit (test-writer, refactor-architect): Read, Write, Edit, Glob, Bash. Builders need write access by definition. Always pair with a downstream review step (the sequential pattern in Section 06) — the builder writes, the reviewer audits, the human merges.
  • Research (researcher, doc-finder): WebFetch, WebSearch, Read, Grep. No write tools at all. Researchers gather; they don't mutate. Output is a markdown briefing the main session uses to decide what to do next.
  • Orchestrator (orchestrator-of-orchestrators): Task (to spawn subagents), plus Read for context discovery. Never give orchestrators Write or Edit — they're meant to coordinate, not execute. Reserve the Task tool for these agents only to prevent delegation loops.

One rule worth tattooing on the back of your hand: only orchestrators get the Task tool. If a worker subagent has access to Task, it can spawn its own subagents, which can spawn their own, which is how a 5-minute review job turns into a 90-minute, $40 token-spend incident. The Claude Code docs are explicit about this — keep the delegation tree shallow and explicit.

For permissions even finer than the tools allowlist, the optional permissions frontmatter field lets you deny specific bash patterns (Bash(git push:*) denied) or specific file paths. The trade-off is maintenance — permissions rules need to be kept in sync with what the agent actually does, and they silently break workflows when they drift.
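A sketch of what that can look like in frontmatter. The deny-list shape below is an assumption extrapolated from the Bash(git push:*) pattern quoted above; verify the exact syntax against the current Claude Code docs before shipping it:

```yaml
---
name: test-writer
description: MUST BE USED for writing and updating unit tests under tests/ for TypeScript code.
tools: Read, Write, Edit, Glob, Bash
# Assumed syntax -- confirm against your Claude Code version:
permissions:
  deny:
    - "Bash(git push:*)"
    - "Bash(git reset:*)"
---
```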

Production lesson
On a client repo in March 2026 we let a refactor-architect agent inherit the main session's full toolset, including Bash. The agent decided it could test its refactor by running git reset and re-running the build — wiping 40 minutes of uncommitted work in the orchestrator session. We now ship every agent with an explicit tools line. No exceptions.

05 · Automatic Delegation: Why the description field is everything.

Claude Code picks which subagent to invoke automatically by matching the user's intent against the description field of every available agent. The algorithm is not magic — it's a model judging description-versus-intent fit. So the description is the entire interface between user intent and agent invocation. Get it keyword-rich and specific, and the agent runs. Get it vague, and Claude Code does the work in the main session.

Four patterns lift descriptions out of the noise:

  • Lead with "MUST BE USED for X". The capitalized imperative is a strong priority signal. Use it for the canonical task the agent owns.
  • Name the languages, frameworks, and tasks. "TypeScript / Next.js / Python code" matches better than "modern web code". Specificity wins.
  • State the boundary. "Read-only — never writes files" is both a behavior contract and a routing hint. The model uses it when picking between competing agents.
  • Cover one job, not five. An agent that advertises code review AND test writing AND documentation will lose to specialized agents on each of those tasks. Split it.

When two agents have overlapping descriptions, Claude Code picks one — usually the one whose description is more specific to the current task. Project-scope agents (the ones in your repo's .claude/agents/) beat user-scope agents (the ones in ~/.claude/agents/) when both define an agent with the same name. Use that precedence: the project file is the team standard, the user file is your personal fallback.

"The description field is the entire interface between user intent and agent invocation. Treat it like the name of a public API method — specific, stable, advertised." — Production lesson · Digital Applied subagent kit

Two failure modes to watch for. First, the silent skip: a user asks for code review, the agent never runs, and the main session answers in-line — usually because the description was too vague. Fix by adding the "MUST BE USED for code review" anchor. Second, the wrong-agent invocation: the user wanted a security audit but the code-reviewer ran instead. Fix by making the security-auditor's description name the threat categories explicitly ("OWASP top 10, secret detection, RBAC review") so its match score beats the generic code-reviewer.
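To make the silent-skip fix concrete, here is a before and after. Both descriptions are illustrative, not taken from a shipped kit:

```yaml
# Too vague: rarely auto-invoked; the main session answers in-line instead.
description: Helps with code quality.

# Specific: reliably auto-invoked for review requests.
description: MUST BE USED for code review, PR review, and security audit on TypeScript / Next.js code. Read-only — never writes files.
```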

06 · Orchestration Patterns: Sequential, parallel, role-based, hierarchical.

One subagent is useful. Five subagents wired into a pattern is leverage. The four patterns below cover almost every workflow we've seen in production — the choice is mostly about whether order matters and whether you're optimizing for wall-clock time or token spend.

Sequential is the default. The orchestrator invokes agent A, waits for its output, feeds the relevant slice into agent B, waits again, hands off to agent C. The classic review-test-deploy chain. Wall-clock time is the sum of agent latencies; token cost stays close to baseline.

Parallel spawns multiple agents at once — up to ten concurrent invocations. The orchestrator collects their outputs and synthesizes. Wall-clock time drops to the slowest agent's latency; token cost scales with the number of parallel agents because every agent re-reads the system prompt and the relevant context.

Role-based is parallel with explicit personas — send the same code to a security reviewer, a performance reviewer, and a maintainability reviewer simultaneously. The orchestrator merges their outputs into one verdict. Useful when you want orthogonal perspectives, not redundant ones.

Hierarchical is the orchestrator-of-orchestrators pattern. A top-level agent delegates to a mid-level orchestrator, which spawns workers. Powerful but easy to get wrong — each layer adds latency and token spend, and a misbehaving worker at depth 3 is hard to debug. Reserve for genuinely multi-stage workflows (multi-repo audits, cross-cutting refactors).

Approximate token-cost ratio, sequential vs parallel orchestration (multipliers are illustrative — actual spend depends on agent system prompt size, context loaded, and output token count):

  • 1 agent (sequential): baseline · single invocation · wall-clock time is the sum of agent latencies
  • 5 parallel agents: ~5× tokens · wall-clock drops to the slowest agent's latency · context replicated 5×
  • 10 parallel agents: ~10× tokens · max concurrent · context replicated 10× · the synthesis step adds tokens

The trade-off curve is worth internalizing. If wall-clock time matters (interactive review during a coding session, time-critical incident triage), parallel earns its 5-10× token premium. If cost matters (overnight repo audits, bulk refactor passes), sequential keeps the bill predictable. The wrong answer is to use parallel by default — it's the most-common subagent cost pathology we see in the wild.
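The arithmetic behind that curve is easy to sanity-check. A back-of-envelope model, with all numbers illustrative assumptions rather than measurements: sequential lets the orchestrator read the full context once and feed each agent a relevant slice, while parallel replicates the full context into every agent.

```shell
# Illustrative token math (assumed sizes, not measured figures):
FULL=20000    # full repo context the orchestrator reads
SLICE=1000    # relevant slice passed to each sequential agent
OUT=1000      # output tokens per agent
N=10          # number of agents

# Sequential: context read once, then N (slice + output) invocations.
SEQ=$(( FULL + N * (SLICE + OUT) ))
# Parallel: every agent re-reads the full context.
PAR=$(( N * (FULL + OUT) ))

echo "sequential=$SEQ parallel=$PAR ratio=$(( PAR / SEQ ))x"
```

With these assumed sizes the ratio lands at about 5×, inside the 5-10× band quoted above; shrinking the slice or growing the shared context pushes it higher.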

For more on how Claude Code's agentic architecture supports these orchestration patterns and what the broader research literature suggests, our deep-dive on the Claude Code agentic architecture lessons unpacks the underlying primitives. The patterns above are the applied layer; that piece is the theory layer.

07 · Worked Example: A code-reviewer subagent, paste-ready.

The block below is a complete, production-grade .claude/agents/code-reviewer.md. Frontmatter on top, four-anchor system prompt below. Drop it into the .claude/agents/ directory of any TypeScript / Next.js repo, restart your Claude Code session, and ask Claude to review your latest changes. The agent will be auto-invoked.

---
name: code-reviewer
description: MUST BE USED for code review, PR review, security audit, and refactor suggestions on TypeScript / Next.js / Python code. Read-only — never writes files. Returns a markdown summary with Critical / Warnings / Suggestions sections.
tools: Read, Grep, Glob, Bash
model: sonnet
context: project
autoMemoryEnabled: false
---

# Role

You are a senior code-reviewer specializing in TypeScript, Next.js (App Router), and Python. You review code for correctness, security, performance, maintainability, and adherence to project conventions. You do not write or edit files — you describe the changes a human or a separate builder agent should make.

# Context discovery

Before reviewing any code, in this exact order:

1. Read CLAUDE.md at the repo root. Project conventions there override everything below.
2. Run `git diff main...HEAD` (or the configured base branch) to identify the changes in scope.
3. For each changed file, read the file in full plus the immediate neighbors it imports from.
4. If there is a relevant docs/ entry for the touched feature, read it.

If any of the above is missing or unclear, return a one-paragraph "Context insufficient" response naming what you would need to proceed.

# Workflow

For every changed file:

1. **Correctness.** Does the code do what the PR description / commit message claims? Any obvious bugs, off-by-one errors, unhandled error paths, race conditions?
2. **Security.** Any new untrusted input handled without validation? Any secrets logged, leaked, or hard-coded? Any SQL / NoSQL / shell injection paths? Any auth checks missed on a new route?
3. **Performance.** Any N+1 queries, blocking I/O on a hot path, unbounded loops, missing indexes, missing caching, over-fetching?
4. **Maintainability.** Any duplicated code that should be extracted? Any function over ~50 lines that should be split? Any unclear naming, missing types, missing tests?
5. **Project conventions.** Anything inconsistent with CLAUDE.md? With the existing patterns in the directory?

# Output format

Return a single markdown document with exactly four sections:

## Critical
Issues that must be fixed before merge. Bug, security flaw, data-loss path, breaking change. For each: file path, line number, what is wrong, what to do instead.

## Warnings
Issues that should be fixed before merge but aren't blockers. Performance smells, missing tests, naming, minor maintainability concerns. Same format as Critical.

## Suggestions
Optional improvements. Refactors, alternative approaches, library upgrades. Same format.

## Verdict
One sentence. "Approve" / "Approve with comments" / "Request changes — see Critical".

# Hard rules

- Never write or edit files. If you find a bug, describe the fix; do not implement it.
- Reference file paths and line numbers for every flagged issue.
- Never speculate about code you have not read. If you skipped a file, say so.
- Keep total output under 1500 words. Concision is part of the value.

And here's what a session looks like once that file is in place. The user asks for review in plain English; Claude Code detects the match against the agent's description and delegates without an explicit slash command:

# User
Review the changes on this branch.

# Claude Code (main session)
I'll delegate to the code-reviewer subagent — it matches this request
and has the right toolset (Read, Grep, Glob, Bash).

[Subagent invocation: code-reviewer]
  → reading CLAUDE.md
  → git diff main...HEAD
  → reading app/(marketing)/blog/<slug>/page.tsx
  → reading lib/blog-posts.ts
  → returning structured review

# code-reviewer output

## Critical
None.

## Warnings
- app/(marketing)/blog/<slug>/page.tsx:147 — `<em>` emphasis used inside
  `BlogTocItem.label` would not render (label is rendered as plain text).
  Strip the wrapping span.
- lib/blog-posts.ts:381 — contentSnippet is 612 chars; schema gate fails
  at >600. Trim to ~580.

## Suggestions
- The lede paragraph repeats "agentic" three times in three sentences.
  Vary the second use.

## Verdict
Approve with comments — address both Warnings before merge.

# Claude Code (main session)
Two Warnings flagged. Want me to apply both fixes, or pull up the
diffs for you to apply manually?

Two things to notice. The subagent never wrote a file — it described the fixes and handed the verdict back. The main session (which can write files) then offered to apply the changes. That hand-off is the whole point of least-privilege tools: the reviewer reviews, the orchestrator decides, the human stays in the loop.

Start with this agent. Add a test-writer next (Read + Write + Edit + Bash, scoped to the test directories). Then layer in a security-auditor with a parallel orchestration pattern for PR-time review. Within a week you'll know whether subagents fit your team's flow. If you want the same calibration applied to your repos and review gates, our AI transformation engagements ship custom subagent kits — code review, test generation, deployment guards — tuned to your stack.

For broader context on how subagents compare to other agentic tooling, see our roundup of AI coding agents — Claude Code, Cursor, Codex, Replit. And if you're interested in building tooling that subagents can call rather than just defining the subagents themselves, the TypeScript MCP server development guide covers the next layer down.

Conclusion

Subagents are the smallest unit of leverage in Claude Code.

The interface is markdown. The contract is the frontmatter plus a four-anchor system prompt. The delegation is driven by the description field. The blast radius is bounded by the tool allowlist. None of those primitives are exotic — what's exotic is how much leverage falls out of getting all four right on a single agent and then composing five of them.

Where the leverage compounds: once your team has five to ten well-described subagents — reviewer, test-writer, refactor-architect, security-auditor, doc-writer, deployment-guard — the main session stops doing the actual work and becomes a thin orchestrator. The human in the loop reviews orchestration decisions, not line-level edits. That is the agentic development workflow most teams want and almost none have actually built.

Practical next step: start with a code-reviewer (lowest risk — read-only, immediate value on every PR). Add a test-writer once you trust the review loop. Then introduce one orchestrator pattern — a sequential review-then-test chain is the easiest first step. You'll know within a week whether subagents fit your team's flow. If they do, the next 5-10 agents take days, not weeks.

Build your team's subagent kit

Custom subagents are the fastest way to scale your team's review and build workflows.

Our agentic AI engineering team designs and ships custom Claude Code subagent kits — code review, security audit, test generation, deployment guards — calibrated to your repos and your team's gates.

Free consultation · Expert guidance · Tailored solutions
What we ship

Subagent engagements

  • Bespoke subagent kit calibrated to your repo and review gates
  • Orchestrator agents for multi-stage workflows (review → test → deploy)
  • Hook-integrated subagents firing on Stop / SubagentStop events
  • Skill + subagent combinations for compounded leverage
  • Team enablement and rollout — adoption playbook included
FAQ · Subagent guide

The questions teams ask before shipping their first subagent.

What is a Claude Code subagent, and when should I use one?

A Claude Code subagent is a markdown file in .claude/agents/ (project) or ~/.claude/agents/ (user) that defines a focused expert with its own system prompt, tool allowlist, and backing model. Each invocation runs as a fresh Claude session with the subagent's pre-baked prompt — no memory of prior calls, no access to the main session's chat history. Use one when a task is repeated enough to justify the setup cost, when a sub-task benefits from a specialized system prompt (code review, security audit, test writing), or when you want to restrict tool access for a class of operation (read-only reviewers, no-bash researchers). One-off tasks usually don't warrant a subagent — the friction of defining one outweighs the leverage.