Agent architecture vocabulary stabilized in 2024-2026 around eight canonical patterns. The patterns span four quadrants — single-agent, collaborative multi-agent, competitive multi-agent, and orchestration topology — and most production agent systems are compositions of two or three from across those quadrants.

This taxonomy reference walks through the eight patterns with reference papers, when-to-use guidance, failure modes, and framework support across LangGraph, AutoGen, CrewAI, OpenAI Agents SDK, and Microsoft Agent Framework.

Use it as the architecture-decision document when you scope a new agent project. Most teams over-engineer toward multi-agent topologies before single-agent reaches its quality ceiling. The taxonomy clarifies when escalation is warranted.

Key takeaways

01
Start single-agent. Escalate to multi-agent only when single-agent caps out on a measured quality dimension. Most teams add coordination overhead before single-agent reaches its ceiling. Build a single-agent baseline, measure failure modes, then add multi-agent topology if the failure mode is decomposable.
02
Eight patterns cover ~95% of production agent systems. Anything beyond is composition or domain specialization. The four quadrants × two patterns each give a complete vocabulary for architecture conversations. New agent designs almost always reduce to a composition of these eight.
03
Hierarchical (supervisor-worker) and graph topologies are the two multi-agent patterns that earn their cost in production. Swarm and blackboard patterns are theoretically interesting but rarely outperform hierarchical or graph in practice. Default to one of those two when going multi-agent.
04
ReAct is still the right single-agent default. Reflexion is the right add-on when failure modes repeat. ReAct + Reflexion is the production-grade single-agent stack. Add plan-and-execute when planning is the bottleneck; add verifier-critic when output quality is the bottleneck.
05
Framework choice matters less than pattern fit. LangGraph, AutoGen, CrewAI, and OpenAI Agents SDK all support the canonical patterns. Pick by team familiarity and ops integration, not by exclusive pattern support. The patterns map across frameworks; rebuilding to switch frameworks is rarely justified.

Quadrant 01: Single-agent patterns.

One model orchestrating its own loop. The default starting point for agent design.

Pattern 1: ReAct (Reasoning + Acting). Yao et al. (2022). The model alternates a "thought" trace with an "action" call, using each observation to update its next thought. The canonical agent loop and still the right default for general-purpose agents.

When to use: general agent tasks where reasoning and tool use interleave naturally; tasks under ~30 steps without complex coordination.

Failure modes: long-horizon coherence loss past ~50 steps; repeated identical mistakes when no reflection mechanism is added; cache invalidation in long runs.

Mitigations: add Reflexion for repeat-failure issues; add re-anchor checkpoints at ~40 steps to manage cache invalidation.
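The thought, action, observation cycle can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `scripted_policy` is a hypothetical stand-in for a model call, `TOOLS` is an invented one-tool registry, and the 30-step budget simply mirrors the rule of thumb above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an LLM call: reads the transcript so far and
# returns (thought, action_name, action_input). A real agent would prompt a model here.
def scripted_policy(transcript: List[str]) -> Tuple[str, str, str]:
    observations = [line for line in transcript if line.startswith("observation:")]
    if not observations:
        return ("I need the sum first.", "add", "2,3")
    return ("I have the result.", "finish", observations[-1].split(": ", 1)[1])

# Invented tool registry for the example: one tool that adds comma-separated integers.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split(","))),
}

def react_loop(policy, tools, task: str, max_steps: int = 30) -> str:
    """Alternate thought -> action -> observation until the policy finishes."""
    transcript = [f"task: {task}"]
    for _ in range(max_steps):
        thought, action, arg = policy(transcript)
        transcript.append(f"thought: {thought}")
        if action == "finish":
            return arg
        observation = tools[action](arg)  # act, then feed the result back in
        transcript.append(f"action: {action}({arg})")
        transcript.append(f"observation: {observation}")
    raise RuntimeError("step budget exhausted before the agent finished")

print(react_loop(scripted_policy, TOOLS, "add 2 and 3"))
```

The hard step budget is the design choice worth keeping: it turns silent long-horizon drift into an explicit, measurable failure.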

Pattern 2: Reflexion. Shinn et al. (2023). ReAct plus an explicit self-critique step after each iteration; the critique is appended to context as guidance for the next attempt. Reduces repeated failure modes on long-horizon coding and reasoning tasks.

When to use: tasks where the model has been observed to repeat the same mistake across attempts; coding and math reasoning workloads.

Failure modes: latency tax (extra model call per step); over-correction can cause oscillation; critique quality degrades on subjective tasks.

Mitigations: bound the critique-revise cycle to 2-3 attempts; pair with rubric-based critique for subjective tasks.
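A bounded critique-revise cycle might look like the sketch below. The three callables are toy stand-ins (assumptions for illustration) for the actor, the evaluator, and the self-critique model call; only the loop shape and the attempt cap come from the description above.

```python
from typing import Callable, List, Optional

def reflexion(attempt: Callable[[List[str]], Callable],
              evaluate: Callable[[Callable], bool],
              critique: Callable[[Callable], str],
              max_attempts: int = 3) -> Optional[Callable]:
    """Retry with accumulated self-critiques, capped at max_attempts."""
    reflections: List[str] = []
    for _ in range(max_attempts):
        candidate = attempt(reflections)          # the actor sees all prior critiques
        if evaluate(candidate):
            return candidate
        reflections.append(critique(candidate))   # stored as guidance for the next attempt
    return None                                   # budget spent: surface the failure

# Toy stand-ins (hypothetical): the "actor" repeats a classic mistake until told otherwise.
def toy_attempt(reflections: List[str]) -> Callable:
    if any("returns None" in r for r in reflections):
        return lambda xs: sorted(xs)
    return lambda xs: xs.sort()

def toy_evaluate(fn: Callable) -> bool:
    return fn([3, 1, 2]) == [1, 2, 3]

def toy_critique(fn: Callable) -> str:
    return "list.sort() returns None; return a new sorted list instead"

solution = reflexion(toy_attempt, toy_evaluate, toy_critique)
print(solution([2, 1]) if solution else "no passing attempt")
```

Returning `None` when the cap is hit, instead of looping further, is one way to avoid the oscillation failure mode noted above.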

Default: ReAct (Yao 2022). Thought → action → observation, repeat. The single-agent default. Strong on reasoning + tool use; weak on long-horizon without re-anchoring.

Reliability: Reflexion (Shinn 2023). ReAct + post-step self-critique. Reduces repeated failure modes by 30-50% on coding and math. Pairs naturally with ReAct.

Quadrant 02: Multi-agent collaborative.

Multiple agents working together toward a shared goal. Adds coordination overhead in exchange for parallelism or specialization.

Pattern 3: Plan-and-execute. Two-phase loop: planner agent emits an ordered plan; executor agent (often a different, cheaper model) walks the plan one step at a time. Cheaper at scale; brittle when plans need mid-run adaptation.

When to use: predictable workflows with known structure; tasks where planning time can be amortized across many similar runs.

Failure modes: plan brittleness when the world changes mid-execution; planner-executor capability mismatch; over-optimization for plan completion versus user goal.

Mitigations: add re-plan triggers on execution failures; use a stronger reasoning model for the planner than for the executor; bound plan length.
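The two-phase loop with a re-plan trigger and a bounded plan length can be sketched as follows. `toy_planner` and `toy_executor` are hypothetical stand-ins for the planner and executor models, and the step names are invented for the example.

```python
from typing import Callable, List

def plan_and_execute(planner: Callable[[str, List[str], List[str]], List[str]],
                     executor: Callable[[str], bool],
                     goal: str,
                     max_replans: int = 2,
                     max_plan_len: int = 10) -> List[str]:
    """Walk the plan step by step; re-plan on failure, within a fixed budget."""
    completed: List[str] = []
    failed: List[str] = []
    plan = planner(goal, completed, failed)[:max_plan_len]   # bound plan length
    while plan:
        step = plan.pop(0)
        if executor(step):
            completed.append(step)
            continue
        failed.append(step)                                   # re-plan trigger
        if len(failed) > max_replans:
            raise RuntimeError(f"re-plan budget exhausted at step: {step}")
        plan = planner(goal, completed, failed)[:max_plan_len]
    return completed

# Toy stand-ins (hypothetical): the primary registry is down, so the planner must adapt.
def toy_planner(goal: str, completed: List[str], failed: List[str]) -> List[str]:
    push = "push:mirror" if "push:primary" in failed else "push:primary"
    return [s for s in ["build", push, "deploy"] if s not in completed]

def toy_executor(step: str) -> bool:
    return step != "push:primary"

print(plan_and_execute(toy_planner, toy_executor, "ship release"))
```

Passing the completed and failed steps back to the planner is what lets the new plan differ from the old one. Without that, a re-plan trigger just retries the same brittle plan.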

Pattern 4: Supervisor-worker. Hierarchical multi-agent. Supervisor agent decomposes tasks into sub-tasks routed to specialized worker sub-agents. Each worker runs its own loop; supervisor aggregates and decides whether the overall task is complete.

When to use: tasks decomposable into clear sub-tasks (research with topic-specific subagents, multi-domain reasoning); workloads benefiting from worker specialization.

Failure modes: coordination overhead can dominate simple tasks; supervisor drift; sub-task conflicts not surfaced to supervisor.

Mitigations: set explicit interface contracts between supervisor and workers; require structured outputs from workers; bound sub-task budgets.
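One way to express the interface contract is a typed result that every worker must return, checked by the supervisor. Everything below is an illustrative sketch: `WorkerResult`, the worker names, and the budget numbers are assumptions, and the workers are plain functions where real sub-agent loops would run.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class WorkerResult:
    """Structured output every worker must return (the interface contract)."""
    subtask: str
    ok: bool
    summary: str
    steps_used: int

Worker = Callable[[str, int], WorkerResult]

def supervise(task: str,
              decompose: Callable[[str], List[Tuple[str, str]]],
              workers: Dict[str, Worker],
              aggregate: Callable[[List[WorkerResult]], str],
              step_budget: int = 5) -> str:
    results: List[WorkerResult] = []
    for role, subtask in decompose(task):
        result = workers[role](subtask, step_budget)
        if not isinstance(result, WorkerResult):
            raise TypeError(f"worker '{role}' broke the output contract")
        if result.steps_used > step_budget:       # enforce the sub-task budget
            result = WorkerResult(subtask, False, "over budget", result.steps_used)
        results.append(result)
    return aggregate(results)

# Toy stand-ins (hypothetical) for decomposition, workers, and aggregation.
def toy_decompose(task: str) -> List[Tuple[str, str]]:
    return [("pricing", f"pricing for {task}"), ("reviews", f"reviews for {task}")]

TOY_WORKERS: Dict[str, Worker] = {
    "pricing": lambda subtask, budget: WorkerResult(subtask, True, "pricing: ok", 2),
    "reviews": lambda subtask, budget: WorkerResult(subtask, True, "reviews: ok", 3),
}

def toy_aggregate(results: List[WorkerResult]) -> str:
    failed = [r.subtask for r in results if not r.ok]
    if failed:
        return "incomplete: " + ", ".join(failed)
    return " | ".join(r.summary for r in results)

print(supervise("compare laptops", toy_decompose, TOY_WORKERS, toy_aggregate))
```

Because failures come back as structured results, the aggregator can see an over-budget or failed sub-task and decide what to do about it.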

"Hierarchical wins over swarm in production almost every time. The supervisor anchors goal alignment; swarms drift without it."— Internal multi-agent retro, March 2026

Quadrant 03: Multi-agent competitive and adversarial.

Multiple agents in a tension or critique relationship. Used for quality and safety improvement, not parallel work.

Pattern 5: Multi-agent debate. Two or more agents argue different positions; a separate judge or synthesizer agent draws the conclusion. Used for decisions where multiple perspectives matter or where adversarial stress-testing improves output.

When to use: high-stakes decisions where considering counter-positions matters; safety-critical evaluations; brainstorming where divergent thinking is desired.

Failure modes: agents converge prematurely; adversarial agents weaken on long debates; judge bias toward verbose arguments.

Mitigations: bound debate rounds; require structured arguments; rotate judge identity to reduce bias.
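A bounded debate with a separate judge could be wired roughly like this. The debaters and judge are toy functions standing in for model calls, and the `max_chars` cap is an assumed, crude guard against the verbosity bias mentioned above, not an established technique from the source.

```python
from typing import Callable, Dict, List, Tuple

Transcript = List[Tuple[str, str]]
Debater = Callable[[str, Transcript], str]

def debate(question: str,
           debaters: Dict[str, Debater],
           judge: Callable[[str, Transcript], str],
           max_rounds: int = 2,
           max_chars: int = 200) -> Tuple[str, Transcript]:
    """Run a fixed number of rounds, then hand the full transcript to the judge."""
    transcript: Transcript = []
    for _ in range(max_rounds):                    # bounded rounds
        for name, speak in debaters.items():
            argument = speak(question, transcript)[:max_chars]  # cap argument length
            transcript.append((name, argument))
    return judge(question, transcript), transcript

# Toy stand-ins (hypothetical): a terse advocate, a verbose skeptic, a counting judge.
TOY_DEBATERS: Dict[str, Debater] = {
    "advocate": lambda q, t: f"Yes: point {len(t) + 1}",
    "skeptic": lambda q, t: "No: " + "risk " * 100,
}

def toy_judge(question: str, transcript: Transcript) -> str:
    return f"synthesis of {len(transcript)} arguments"

verdict, log = debate("Should we migrate?", TOY_DEBATERS, toy_judge)
print(verdict)
```

Rotating the judge, the third mitigation above, would amount to passing a different `judge` callable per run.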

Pattern 6: Verifier-critic. A generator agent produces output; a critic agent scores or annotates the output against a rubric; the generator revises based on critique. Standard pattern for catching hallucinations and policy violations in production.

When to use: outputs requiring high accuracy or policy compliance; safety filtering; quality regulation in content generation.

Failure modes: critic-generator collusion when both use the same model; over-correction on edge cases; critique quality degrades on subjective dimensions.

Mitigations: use different models for generator and critic when possible; cap critique-revise cycles; pair with human review for low-confidence outcomes.
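The generate, critique, revise loop with a cycle cap and a human-review fallback can be sketched as below. The rubric checks and the generator are deliberately trivial, hypothetical stand-ins. In practice both would be model calls, ideally to different models.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Critique:
    passed: bool
    issues: List[str]

def generate_with_critic(generate: Callable[[str, List[str]], str],
                         critic: Callable[[str], Critique],
                         prompt: str,
                         max_cycles: int = 3) -> Tuple[str, str]:
    """Revise until the critic passes or the cycle budget is spent."""
    feedback: List[str] = []
    draft = ""
    for _ in range(max_cycles):
        draft = generate(prompt, feedback)
        review = critic(draft)
        if review.passed:
            return draft, "accepted"
        feedback = review.issues               # the next draft must address these
    return draft, "needs_human_review"         # escalate instead of looping forever

# Toy rubric critic (hypothetical): requires a test and forbids eval().
def rubric_critic(draft: str) -> Critique:
    issues = []
    if "assert" not in draft:
        issues.append("missing test: add at least one assertion")
    if "eval(" in draft:
        issues.append("security: avoid eval")
    return Critique(passed=not issues, issues=issues)

# Toy generator (hypothetical): adds a test only when the critique asks for one.
def toy_generate(prompt: str, feedback: List[str]) -> str:
    code = "def add(a, b):\n    return a + b\n"
    if any("test" in issue for issue in feedback):
        code += "assert add(2, 3) == 5\n"
    return code

draft, status = generate_with_critic(toy_generate, rubric_critic, "write add()")
print(status)
```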

Use case: Code generation review (verifier-critic). Verifier-critic with code-specific rubric (security, style, tests). Generator revises until critic passes or budget exhausted.

Use case: Strategic recommendation (multi-agent debate). Multi-agent debate with three perspectives (advocate, skeptic, synthesizer). Output is the synthesis, not any single perspective.

Use case: Policy compliance check (verifier-critic). Verifier-critic with policy-specific rubric. Generator output gated; critic flags violations for revision or human review.

Use case: Open-ended brainstorm (multi-agent debate). Multi-agent debate with divergent thinking and synthesizer. Better than single-agent for capturing breadth.

Quadrant 04: Orchestration topologies.

How multi-agent systems are wired together. The topology governs coordination overhead, fault tolerance, and observability.

Pattern 7: Graph orchestration. Agents arranged as nodes in a directed graph with explicit edges for control flow. LangGraph and Microsoft Agent Framework implement this canonically. Strong observability; deterministic flow.

When to use: structured workflows with defined decision points; production systems requiring trace-level debugging; conditional logic across multiple agents.

Failure modes: graph complexity grows fast; edge cases not covered; cyclic execution requires careful termination.

Mitigations: bound graph node count; require terminal nodes; instrument every edge with metrics.
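To make the node-and-edge idea concrete, here is a framework-agnostic sketch in plain Python. It deliberately does not use the LangGraph or Microsoft Agent Framework APIs. The node names, the `run_graph` helper, and the revision rule are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]
Node = Callable[[State], Optional[str]]   # returns the next node name, or None if terminal

def run_graph(nodes: Dict[str, Node], entry: str, state: State,
              max_steps: int = 20) -> Tuple[State, List[str]]:
    """Follow explicit edges from the entry node until a terminal node returns None."""
    current = entry
    trace: List[str] = []                 # per-node trace for debugging and metrics
    for _ in range(max_steps):            # guards cyclic graphs against non-termination
        trace.append(current)
        nxt = nodes[current](state)
        if nxt is None:
            return state, trace
        current = nxt
    raise RuntimeError(f"no terminal node reached within {max_steps} steps")

# Hypothetical nodes: a draft/review/revise cycle that ends at a terminal publish node.
def draft(state: State) -> Optional[str]:
    state["text"] = "v1"
    return "review"

def review(state: State) -> Optional[str]:
    return "publish" if state.get("revisions", 0) >= 1 else "revise"

def revise(state: State) -> Optional[str]:
    state["revisions"] = state.get("revisions", 0) + 1
    state["text"] = f"v{state['revisions'] + 1}"
    return "review"

def publish(state: State) -> Optional[str]:
    state["published"] = True
    return None

NODES: Dict[str, Node] = {"draft": draft, "review": review, "revise": revise, "publish": publish}

final_state, trace = run_graph(NODES, "draft", {})
print(trace)
```

The trace list is the observability hook: every transition is recorded, which is the property that makes graph topologies easier to debug than peer-to-peer ones.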

Pattern 8: Swarm / blackboard. Peer-agent topology with no supervisor. Agents post and read from a shared blackboard or message bus. OpenAI's Swarm reference implementation and AutoGen's group chat fall under this pattern.

When to use: exploratory tasks where the right decomposition isn't known up front; research and brainstorming; systems with naturally peer-equivalent agents.

Failure modes: goal drift without supervisor anchor; coordination deadlocks; debugging difficulty.

Mitigations: add a coordinator role even if not full supervisor; instrument the message bus heavily; bound agent budget to prevent runaway.
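A blackboard loop with a round budget and stall detection might look like the following sketch. The agents, the `has` helper, and the done-condition are hypothetical, and the loop itself plays the lightweight coordinator role suggested above without being a full supervisor.

```python
from typing import Callable, Dict, List, Optional, Tuple

Board = List[Tuple[str, str]]                     # (agent name, note) entries
Agent = Callable[[Board], Optional[str]]          # returns a note, or None to pass

def has(board: Board, prefix: str) -> bool:
    return any(note.startswith(prefix) for _, note in board)

def run_blackboard(agents: Dict[str, Agent],
                   done: Callable[[Board], bool],
                   max_rounds: int = 5) -> Board:
    """Peers take turns reading and posting; stop when done, stalled, or out of budget."""
    board: Board = []
    for _ in range(max_rounds):
        posted = False
        for name, agent in agents.items():
            note = agent(board)
            if note is not None:
                board.append((name, note))
                posted = True
            if done(board):
                return board
        if not posted:                             # nobody could contribute
            raise RuntimeError("stalled: no agent contributed this round")
    raise RuntimeError("round budget exhausted")

# Hypothetical peers: the writer can only act once a fact is on the board.
def researcher(board: Board) -> Optional[str]:
    return None if has(board, "fact:") else "fact: latency doubled after the deploy"

def writer(board: Board) -> Optional[str]:
    if has(board, "fact:") and not has(board, "draft:"):
        return "draft: incident summary based on posted facts"
    return None

AGENTS: Dict[str, Agent] = {"researcher": researcher, "writer": writer}

def is_done(board: Board) -> bool:
    return has(board, "draft:")

print(run_blackboard(AGENTS, is_done))
```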

Topology selection

Graph for structured production workflows. Hierarchy (Pattern 4) for clear task decomposition. Swarm only for exploratory or research-mode systems where flow can't be pre-specified. Default to graph or hierarchy in production.

Section 05: Framework support matrix.

How the canonical patterns map across the major agent orchestration frameworks as of Q2 2026. For a side-by-side feature breakdown, see our comparison of agent orchestration platforms.

LangGraph. LangChain's graph orchestration framework. Strong on Pattern 7 (graph) by design. Supports ReAct, Reflexion, plan-and-execute via prebuilt patterns. Supervisor-worker via subgraphs.

AutoGen. Microsoft Research's multi-agent framework. Strong on Patterns 4 (supervisor-worker), 5 (debate), 8 (swarm via group chat). ReAct and Reflexion via single-agent mode.

CrewAI. Role-based multi-agent framework. Strong on Pattern 4 (supervisor-worker) via crew/task metaphor. Supports verifier-critic via task chains.

OpenAI Agents SDK. Native multi-agent primitives in the OpenAI SDK. Supports Patterns 1, 2, 4, 7 via Handoff and tool calling. Swarm via the swarm reference implementation.

Microsoft Agent Framework. .NET- and Python-native agent framework. Strong on graph orchestration; supports all canonical patterns.

Coverage: Production-grade choices (5 frameworks, Q2 2026). All five support the canonical patterns. Choose by team familiarity and ops integration, not by exclusive pattern support.

Default: LangGraph + AutoGen (2 leaders, most-used). LangGraph for graph-driven production workflows; AutoGen for multi-agent research and prototyping. Most production teams pick one of these.

Vertical: CrewAI + OpenAI SDK (2 specialists, specialty). CrewAI for role-based decomposition; OpenAI Agents SDK when staying in OpenAI ecosystem. Both production-ready.

Section 06: Pattern selection guide.

How to pick the right pattern for a given workload. Decision-tree style: start with the simpler option; escalate only when measurement justifies.

Step 1: Start with ReAct. Build a single-agent ReAct baseline. Measure success rate, tool-call accuracy, latency, and cost on your evaluation set.

Step 2: Add Reflexion if failures repeat. If your eval shows the agent making the same mistake across multiple runs, add Reflexion. Latency increases by ~30%; quality typically improves 10-30% on the failure-mode subset.

Step 3: Add re-anchor if loops exceed 30 steps. For long-running agents, add re-anchor checkpoints to manage cache invalidation. Prevents the silent prefix-cache miss that can spike costs 5-10×.

Step 4: Add verifier-critic for high-stakes outputs. If output quality matters (legal, financial, code), add a verifier-critic loop. Doubles inference cost on the verified subset; cuts critical errors.

Step 5: Move to multi-agent only when single-agent caps out. Multi-agent adds coordination overhead. Justify with measured gain on a specific failure mode that single-agent can't address (parallel work, role specialization, perspective diversity).

Step 6: Choose hierarchy or graph for production multi-agent. Avoid swarm in production; use it in research mode only. Hierarchy when task decomposition is clear; graph when control flow is conditional and needs observability.
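The six steps above can be read as a small decision function. The sketch below is only an encoding of that checklist under simplifying assumptions: the `EvalSignals` fields are invented names for measurements you would take from your own evaluation set, and step 6 is reduced to a single hierarchy-or-graph switch.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class EvalSignals:
    """Hypothetical measurements gathered from a single-agent ReAct baseline."""
    repeated_failures: bool       # same mistake across multiple runs (step 2)
    max_steps: int                # longest observed loop length (step 3)
    high_stakes_output: bool      # legal, financial, code (step 4)
    single_agent_capped: bool     # measured ceiling single-agent can't pass (step 5)
    decomposition_clear: bool     # clear sub-tasks vs. conditional control flow (step 6)

def recommend(signals: EvalSignals) -> List[str]:
    stack = ["ReAct"]                                   # step 1: always the baseline
    if signals.repeated_failures:
        stack.append("Reflexion")
    if signals.max_steps > 30:
        stack.append("re-anchor checkpoints")
    if signals.high_stakes_output:
        stack.append("verifier-critic")
    if signals.single_agent_capped:                     # escalate only on measured need
        stack.append("supervisor-worker" if signals.decomposition_clear
                     else "graph orchestration")
    return stack

print(recommend(EvalSignals(True, 45, False, False, False)))
```

Swarm never appears in the output, matching the guidance to keep it out of production.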

"Most teams adopt multi-agent before single-agent reaches its ceiling. Build the baseline; measure where it caps out; escalate from there."— Internal agent design retro, April 2026

Conclusion: Eight patterns, sequenced by escalation.

The shape of agent architecture · April 2026

Start single. Add reflection. Escalate only when measurement says you must.

The eight canonical agent architecture patterns cover the production design space. Most agent systems are compositions: ReAct + Reflexion + re-anchor for the single-agent stack; supervisor-worker + verifier-critic for high-stakes multi-agent; graph orchestration for production-grade observability.

The escalation discipline matters more than the patterns themselves. Start single-agent; measure where it caps out; add the next pattern that addresses the measured failure mode. Don't lead with multi-agent — coordination overhead often dominates the quality gain when single-agent hasn't hit its ceiling.

Frameworks have converged on supporting the canonical patterns. Pick by team familiarity, ops integration, and observability needs — not by which framework supports a specific pattern uniquely. The patterns map across frameworks; rebuilding to switch frameworks is rarely justified once a system is in production.

Agent Architecture Patterns: 2026 taxonomy.
