Multi-Agent Orchestration Patterns: Pattern Language 2026
Multi-agent orchestration pattern language — producer, consumer, coordinator, critic, and judge archetypes with composition rules and failure-mode handling.
Christopher Alexander wrote A Pattern Language for buildings. Multi-agent systems need one too. This is ours: five archetypes and the composition rules that keep them from deadlocking.
After eighteen months of shipping multi-agent workflows for agency clients — content pipelines, security audits, data extraction, customer support routing — a pattern has emerged. Every reliable system we have built or debugged reduces to five recurring roles and a small set of composition rules. The rest is implementation detail. This guide names those roles, lists the rules, enumerates the failure modes, and walks two worked examples so teams can stop reinventing orchestration and start shipping predictable systems.
Framework-agnostic: Archetypes and rules below map cleanly onto LangGraph, CrewAI, the OpenAI Agents SDK, and the Claude Agent SDK. For the framework matchup, see our agent framework comparison matrix.
Why Pattern Language for Agents
Pattern languages work because they give recurring solutions named handles. In building architecture, "courtyard," "alcove," and "entry transition" are not just descriptions — they are the vocabulary that lets architects have precise, efficient design conversations. In software, the Gang of Four design patterns did the same for object-oriented code. Multi-agent systems are at the same inflection point: the designs are emerging, the failure modes recur, and teams are burning weeks re-deriving solutions that other teams already found.
The value of a pattern language is not novelty — it is shared vocabulary. When one engineer says "we need a critic loop here" and another answers "bounded to three iterations, with a judge at the end," the conversation skips past ten minutes of clarification and lands on the right design. That compression compounds across a team and across a quarter of work. Without a language, every review debates the same tradeoffs and every new hire re-learns the same traps.
- Names recurring roles so designers can refer to them without re-describing the shape each time.
- Specifies composition rules so roles combine predictably rather than by trial and error.
- Enumerates failure modes so known traps have known mitigations rather than being rediscovered in production.
- Stays framework-agnostic so the vocabulary outlives any particular library choice.
- Grounds itself in worked examples so the abstractions have a concrete referent and new users can learn by analogy.
For the broader multi-agent context, see our multi-agent systems guide and the complementary parallel development guide, which focuses on agents collaborating on code.
Archetype 1: Producer
A producer takes an ambiguous input — a goal, a question, a user intent — and emits a set of well-formed work items for downstream consumers. The producer's job is decomposition. It turns "draft a cornerstone post on agent orchestration" into a list of sections, research tasks, and supporting assets. It turns "audit this codebase for security vulnerabilities" into a list of files to inspect and threat models to apply.
- Input: a goal expressed in natural language, often under-specified.
- Output: a list of discrete work items with enough structure for a consumer to execute without further clarification.
- Typical implementations: a research agent that generates a task list, a brief decomposer that turns a client deliverable into sections, a planner that emits a sequence of tool calls.
- Quality signal: consumers can execute every work item without asking follow-up questions.
Producer Design Rules
A good producer emits work items that are mutually exclusive where possible, collectively exhaustive for the stated goal, and sized for a single consumer invocation. Producers that emit overlapping items create redundant consumer work; producers that emit under-scoped items create consumer clarification loops. The sharpest test of producer quality is whether a downstream consumer can act on a work item in isolation, with no access to the original input.
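One way to make these rules checkable is to validate a producer's output before any consumer runs. The sketch below is illustrative only: the `WorkItem` fields and the `validate_work_items` helper are hypothetical names, and the overlap check is a deliberately crude one that only catches instructions that are identical after normalizing case and whitespace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkItem:
    """One unit of producer output (hypothetical schema, not a fixed standard)."""
    item_id: str
    instruction: str           # what the consumer should do
    acceptance_condition: str  # how to tell the item is done
    context: str = ""          # everything the consumer needs, inlined


def validate_work_items(items: list[WorkItem]) -> list[str]:
    """Return human-readable problems; an empty list means the batch looks sane."""
    problems = []
    seen_ids = set()
    seen_instructions = set()
    for item in items:
        if item.item_id in seen_ids:
            problems.append(f"{item.item_id}: duplicate id")
        seen_ids.add(item.item_id)

        # Crude overlap check: identical instructions after normalization.
        normalized = " ".join(item.instruction.lower().split())
        if not normalized:
            problems.append(f"{item.item_id}: empty instruction")
        elif normalized in seen_instructions:
            problems.append(f"{item.item_id}: overlaps an earlier item")
        seen_instructions.add(normalized)

        # A consumer cannot act in isolation without an acceptance condition.
        if not item.acceptance_condition.strip():
            problems.append(f"{item.item_id}: missing acceptance condition")
    return problems
```

A real producer would likely need a semantic overlap check (for example, an embedding comparison), but even this structural pass catches items a consumer could not execute in isolation.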
Designing a multi-agent delivery pipeline? Most agency workflows benefit from explicit producer and consumer separation. Explore our AI Digital Transformation service to architect reliable agent systems for your team.
Archetype 2: Consumer
A consumer takes a single well-formed work item and executes it, returning a result. Consumers are the workhorses of the pattern language: writers drafting sections, implementers writing code, extractors pulling data from documents, classifiers labeling inputs. Consumers do not decompose work, do not critique work, do not decide whether to ship — they execute.
- Input: a single work item with a clear acceptance condition.
- Output: a completed artifact (a draft section, a code diff, an extracted field, a classification label).
- Typical implementations: a writer agent, an implementer agent with tool access, a code-modification agent, a form-filler.
- Quality signal: the output passes the acceptance condition stated in the work item without rework.
Consumer Idempotency and Fan-Out
Consumers should be idempotent on the work item — invoking the same consumer twice with the same input should produce equivalent output. This property makes retry logic simple and lets coordinators fan consumers out across multiple work items in parallel without coordination overhead. Consumers that mutate shared state or depend on invocation order break the composition rules and force coordinators to track ordering, which undermines the main performance benefit of a multi-agent pipeline.
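A minimal sketch of one way to get this property, assuming work items are JSON-serializable dictionaries: key each result on a hash of the work item's content, so a retry with the same item reuses the stored result instead of executing again. `IdempotentConsumer` and `work_item_key` are hypothetical names, and the in-memory dictionary stands in for whatever result store a real pipeline would use.

```python
import hashlib
import json
from typing import Callable


def work_item_key(work_item: dict) -> str:
    """Stable key: the same work item content always hashes to the same value."""
    canonical = json.dumps(work_item, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotentConsumer:
    """Wraps an execute function so repeated calls with the same item reuse the result."""

    def __init__(self, execute: Callable[[dict], str]):
        self._execute = execute
        self._results: dict[str, str] = {}
        self.executions = 0  # counts real executions, for illustration only

    def run(self, work_item: dict) -> str:
        key = work_item_key(work_item)
        if key not in self._results:
            self.executions += 1
            self._results[key] = self._execute(work_item)
        return self._results[key]
```

Because the key depends only on the work item, the coordinator can retry or reorder invocations freely without changing the outcome.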
Archetype 3: Coordinator
A coordinator routes work items between producers, consumers, critics, and judges. It manages sequencing, fan-out, fan-in, retry on failure, and termination. Coordinators are the plumbing of the pattern language — they do not generate work, execute work, or decide about work, but they make every other archetype composable and observable.
- Input: a workflow graph or dispatch policy.
- Output: correctly-ordered invocations of producers, consumers, critics, and judges, with retries and termination conditions.
- Typical implementations: a planner node in LangGraph, the crew process in CrewAI, a dispatcher routing messages to specialized agents, a top-level harness running an orchestrator loop.
- Quality signal: every invocation is accounted for in traces, no pipeline state is hidden from observability, and termination is deterministic.
Bounded Fan-Out and Termination
Coordinators should enforce a bounded fan-out — never dispatch more consumers than your token budget, rate limit, or downstream tool can absorb. They should also enforce a clear termination condition: a max iteration count, a judge decision, or an explicit timeout. Coordinators without termination conditions are the single largest source of runaway token spend in production multi-agent systems.
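As a rough illustration, bounded fan-out with an explicit timeout can be sketched with the standard library's `asyncio.Semaphore`. The function name `bounded_fan_out` and its parameters are assumptions made for this example, and the consumer is any coroutine function supplied by the caller.

```python
import asyncio
from typing import Awaitable, Callable


async def bounded_fan_out(
    work_items: list[str],
    consumer: Callable[[str], Awaitable[str]],
    max_concurrency: int,
    timeout_seconds: float,
) -> list[str]:
    """Run the consumer over every item, never exceeding max_concurrency at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item: str) -> str:
        # Only max_concurrency coroutines can hold the semaphore at a time.
        async with semaphore:
            return await consumer(item)

    # gather preserves input order; wait_for enforces the explicit timeout.
    return await asyncio.wait_for(
        asyncio.gather(*(run_one(item) for item in work_items)),
        timeout=timeout_seconds,
    )
```

If the timeout fires, `asyncio.wait_for` raises `TimeoutError` and cancels the outstanding work, which gives the coordinator a deterministic place to stop.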
Archetype 4: Critic
A critic inspects an artifact and proposes improvements. Critics return suggestions, not decisions — they cannot halt the pipeline, cannot approve or reject, cannot ship or block. Their job is to make the artifact better within a bounded iteration budget. Reviewers, refiners, editors, and linters all fit the critic archetype.
- Input: a completed artifact from a consumer.
- Output: a list of suggested revisions, optionally with a confidence or severity score.
- Typical implementations: a reviewer agent, an editing pass, a refinement loop, a linter, a compliance checker that flags issues without blocking.
- Quality signal: suggestions lead to measured improvements on downstream evals rather than churn.
Bounded Critic Loops
Critics must run with a bounded iteration count, usually two or three. Unbounded critic loops create pathological rewrites where each pass undoes the previous pass's improvements. In practice, the first critic pass catches the obvious problems, the second catches the subtle ones, and a third rarely helps enough to justify its cost. When a judge disagrees with a critic's final suggestions, that is a signal worth tracking — it often reveals a misalignment between critic and judge prompts.
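The bound itself is a few lines of control flow. In this sketch, `critic` and `revise` are caller-supplied callables (hypothetical signatures): the critic returns a list of suggestions, and the loop stops either when the critic has nothing to add or when the pass budget is spent.

```python
from typing import Callable


def run_critic_loop(
    artifact: str,
    critic: Callable[[str], list[str]],
    revise: Callable[[str, list[str]], str],
    max_passes: int = 2,
) -> tuple[str, int]:
    """Apply critic suggestions for at most max_passes; stop early if none are returned."""
    passes = 0
    while passes < max_passes:
        suggestions = critic(artifact)
        if not suggestions:
            break  # nothing left to improve
        artifact = revise(artifact, suggestions)
        passes += 1
    return artifact, passes
```

Returning the pass count alongside the artifact makes it easy to log how often the budget is exhausted, which is a useful signal that the critic prompt may be generating churn.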
Archetype 5: Judge
A judge makes a go-or-no-go decision with the authority to halt or approve the pipeline. Judges are the explicit gates in the system: ship or block, accept or reject, escalate to human or auto-approve. Judges run once per decision point, not in a loop, and their output is deterministic in structure, always one of a fixed set of outcomes (approved, rejected, or escalate), even if the reasoning varies.
- Input: a final artifact, usually after one or more critic passes.
- Output: a decision with a short reason, typically approved, rejected, or escalate-to-human.
- Typical implementations: a gatekeeper agent, an approver at the end of a content pipeline, a compliance judge that decides whether to publish, a security reviewer that decides whether a finding is actionable.
- Quality signal: judge decisions align with human reviewers on a sampled audit set.
Why Critics and Judges Must Stay Separate
Merging critic and judge roles is the most common design mistake we see. The failure mode is predictable: when a single agent can both suggest revisions and block the pipeline, it enters an indefinite loop where new suggestions keep arriving and the gate never closes. Keeping the roles separate, with critics running bounded suggestion loops and judges running once at the end, prevents the deadlock and makes termination predictable.
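One way to keep the roles from merging is to give them different return types, so a critic cannot express a gate decision and a judge cannot emit suggestions. The sketch below is an assumption-laden illustration: `Suggestion`, `Decision`, `Verdict`, and the toy policy inside `judge_once` are all hypothetical, and a real judge would typically call a model rather than apply a fixed rule.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Suggestion:
    """Critic output: advice only, with no authority to halt the pipeline."""
    text: str
    severity: int  # 1 = cosmetic, 3 = serious (scale is an assumption)


@dataclass(frozen=True)
class Decision:
    """Judge output: a verdict from a fixed set plus a short reason."""
    verdict: Verdict
    reason: str


def judge_once(artifact: str, open_suggestions: list[Suggestion]) -> Decision:
    """Toy policy for illustration; called exactly once per decision point."""
    if not artifact.strip():
        return Decision(Verdict.REJECTED, "empty artifact")
    if any(s.severity >= 3 for s in open_suggestions):
        return Decision(Verdict.ESCALATE, "serious suggestion left unresolved")
    return Decision(Verdict.APPROVED, "no blocking issues")
```

With this split, a judge that escalates while the critic had nothing serious to say (or the reverse) shows up as a type-level disagreement that can be counted and reviewed.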
Composition Rules: Stacking Patterns Cleanly
The archetypes compose according to twelve rules. Most are straightforward in isolation; the leverage comes from applying them consistently. Teams that adopt the full set report fewer production incidents and faster onboarding for new engineers.
| Rule | What it prevents |
|---|---|
| 1. Producers never consume their own output | Cycle formation at the decomposition step. |
| 2. Consumers are idempotent on their work item | Retry ambiguity and ordering dependencies. |
| 3. Coordinators enforce acyclicity across archetypes | Infinite loops between archetype nodes. |
| 4. Critic loops are bounded to two or three iterations | Pathological rewrite churn. |
| 5. Judges run once per decision point | Deadlock between gate and suggestion authority. |
| 6. Coordinator fan-out is bounded | Rate-limit storms and runaway token spend. |
| 7. Every invocation produces a span | Blind debugging of opaque pipelines. |
| 8. Every archetype has an eval set | End-to-end scoring that hides component regressions. |
| 9. Critics suggest; judges decide | Role merging and its predictable deadlocks. |
| 10. Work items are self-contained | Consumer clarification loops back to the producer. |
| 11. Shared state is explicit and typed | Silent drift between archetype invocations. |
| 12. Termination conditions are deterministic | Pipelines that never finish on edge inputs. |
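Rule 11 can be made concrete with an immutable, typed state object that is passed explicitly between invocations. This is a minimal sketch: the `PipelineState` fields and the `record_sections` helper are hypothetical, chosen only to show the shape.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PipelineState:
    """Explicit, typed state handed between archetype invocations."""
    brief: str
    sections: tuple[str, ...] = ()
    critic_passes: int = 0


def record_sections(state: PipelineState, sections: list[str]) -> PipelineState:
    """Return a new state; the input state is never modified in place."""
    return replace(state, sections=tuple(sections))
```

Because the dataclass is frozen, any attempt to mutate state in place raises an error immediately instead of surfacing later as output drift between re-runs.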
Rule 7 and rule 8 deserve extra emphasis for production systems. For a deeper treatment of observability patterns, see our agent observability guide, which covers tracing, evaluation, and cost-per-archetype instrumentation in depth.
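To illustrate rule 7 without committing to a particular tracing library, the sketch below records one span per invocation in an in-memory list. The `traced` decorator, the `SPANS` sink, and the span fields are assumptions for this example; a production system would export spans to its tracing backend instead.

```python
import functools
import time

SPANS: list[dict] = []  # in-memory sink; a real system would export these


def traced(archetype: str):
    """Decorator that records one span per invocation, including failures."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "ok"
            try:
                return func(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                # Runs on both success and failure, so no invocation is hidden.
                SPANS.append({
                    "archetype": archetype,
                    "name": func.__name__,
                    "status": status,
                    "duration_s": time.perf_counter() - start,
                })
        return wrapper
    return decorator


@traced("consumer")
def write_section(title: str) -> str:
    return f"## {title}"
```

Tagging each span with its archetype is what makes per-archetype cost and failure-rate reporting possible later.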
Failure Modes: Cycle Detection, Deadlock, Emergent Errors
Eight failure modes recur across production multi-agent systems. Naming them makes them shallow — a known failure with a known mitigation is a much smaller problem than an unnamed one.
- Producer cycle: Producer A emits work that triggers producer B, which emits work that routes back to A. Mitigation: coordinator tracks visited archetype nodes and refuses cycles (composition rule 3).
- Critic-judge deadlock: A single agent holds both suggestion and gate authority, so new suggestions keep arriving and the gate never closes. Mitigation: separate the roles (rule 9), bound the critic loop (rule 4).
- Fan-out storm: Coordinator dispatches more parallel consumers than the token budget or downstream tool can absorb, causing rate-limit failures and retry storms. Mitigation: bounded fan-out (rule 6).
- Compounding error: A small error in a producer's decomposition propagates through consumers and compounds, producing a wildly wrong final output. Mitigation: per-archetype evals (rule 8) that catch regressions at the source.
- State drift: Shared state is modified implicitly across archetype invocations, so the same pipeline produces different outputs on re-runs. Mitigation: typed, explicit shared state (rule 11).
- Clarification loop: Consumers receive under-specified work items and loop back to the producer for clarification, burning tokens and latency. Mitigation: self-contained work items (rule 10).
- Judge drift: Judge decisions drift away from human reviewer decisions over time, usually because the judge prompt was not re-baselined after model upgrades. Mitigation: audit-set sampling and periodic recalibration.
- Non-termination: Pipeline has no deterministic termination condition and runs until a timeout or budget exhaustion on hard inputs. Mitigation: enforce explicit termination (rule 12).
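The cycle case (A triggers B, which routes back to A) can be caught before a workflow ever runs. The sketch below assumes the workflow is available as an adjacency mapping from node name to downstream node names, which is an illustrative representation rather than any framework's format. It uses a standard depth-first search with three visit states.

```python
from typing import Optional


def find_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one cycle as a list of nodes (first node repeated at the end), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color = {node: WHITE for node in graph}
    path: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        color[node] = GRAY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # nxt is already on the current path, so we found a back edge.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

A coordinator could run a check like this at workflow-definition time and refuse to start any graph for which it returns a cycle.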
Pattern-level debugging beats agent-level debugging. When a multi-agent pipeline misbehaves, first ask which archetype is failing and which composition rule is being violated. That frame usually points to the fix in minutes rather than hours.
Worked Example: Content Agency Pipeline
A content agency uses a four-archetype pipeline to turn a client brief into a published post. The shape is producer, consumer, critic, judge, wired together by a coordinator. Each archetype has a clear input, output, and acceptance condition, which makes the pipeline easy to reason about, evaluate, and debug.
Producer: Brief Decomposer
Takes a client brief ("a 2000-word cornerstone post on retail analytics trends for Q3") and emits a structured outline: six sections, three supporting research tasks, a target keyword list, and a draft title. Output is a JSON work-item list the consumer can execute without further clarification.
Consumer: Section Writer
Takes one outline section at a time, writes the prose, inserts the target keywords naturally, and emits a markdown block. Fans out across sections in parallel (bounded by the coordinator's fan-out rule), producing the full draft in roughly a fifth of the time of a sequential run.
Critic: Editor Loop
Takes the assembled draft and proposes revisions for style, accuracy, keyword density, and readability. Runs at most twice — the first pass catches structural issues, the second catches polish. A third pass has historically produced churn rather than improvement.
Judge: Publish Approval
Takes the post-edit draft and emits one of three decisions: approve for publication, reject with reason, or escalate to a human editor. Runs once. A sample of its decisions is compared against human reviewers monthly to catch judge drift.
For agencies building similar pipelines, our Content Marketing service applies this exact pattern language to client deliverables at scale. The key implementation detail is the coordinator — bound fan-out to your model rate limit, track every archetype invocation as a span, and enforce deterministic termination. For deeper implementation patterns on the SDK side, see our Claude Agent SDK production patterns guide.
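To show how the pieces of this example fit together, here is a toy end-to-end sketch of the content pipeline. Every function is a deterministic stub standing in for a model call, and all names (`decompose_brief`, `write_section`, `edit_draft`, `apply_edits`, `approve`, `run_pipeline`) are hypothetical. The point is the wiring, not the agent logic: bounded fan-out, a bounded critic loop, and a judge that runs once.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose_brief(brief: str) -> list[str]:
    """Producer stub: a real one would call a model; here the outline is fixed."""
    return [f"{brief}: section {n}" for n in range(1, 4)]


def write_section(work_item: str) -> str:
    """Consumer stub: turns one work item into a markdown block."""
    return f"## {work_item}\nDraft text."


def edit_draft(draft: str) -> list[str]:
    """Critic stub: suggests a revision until the placeholder text is gone."""
    return ["replace placeholder text"] if "Draft text." in draft else []


def apply_edits(draft: str, suggestions: list[str]) -> str:
    """Revision stub: applies the critic's suggestion."""
    return draft.replace("Draft text.", "Edited text.")


def approve(draft: str) -> str:
    """Judge stub: runs once and returns approved or escalate."""
    return "approved" if "Edited text." in draft else "escalate"


def run_pipeline(brief: str, max_fan_out: int = 2, max_critic_passes: int = 2) -> dict:
    """Coordinator: sequences the archetypes with explicit bounds."""
    sections = decompose_brief(brief)

    # Bounded fan-out: at most max_fan_out consumers run at once.
    with ThreadPoolExecutor(max_workers=max_fan_out) as pool:
        blocks = list(pool.map(write_section, sections))  # map preserves order
    draft = "\n\n".join(blocks)

    # Bounded critic loop: stops early when the critic has nothing to add.
    passes = 0
    while passes < max_critic_passes:
        suggestions = edit_draft(draft)
        if not suggestions:
            break
        draft = apply_edits(draft, suggestions)
        passes += 1

    # Judge runs exactly once, after the critic loop has finished.
    return {"draft": draft, "critic_passes": passes, "decision": approve(draft)}
```

Swapping the stubs for real agent calls should not change the coordinator's shape, which is the property the pattern language is meant to preserve.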
Worked Example: Security Audit Pipeline
A security audit pipeline applies the same pattern language to a radically different domain. The archetypes are identical in shape but specialized in purpose: producer decomposes an audit scope, consumers inspect individual files or threat vectors, critics refine findings, judges decide whether findings are actionable.
Producer: Audit Scoper
Takes the audit scope ("audit this Next.js codebase for OWASP Top 10 vulnerabilities") and emits a work-item list: files to inspect, threat models to apply, dependency vulnerabilities to verify, configuration files to check. Each item has a defined acceptance condition.
Consumer: Vulnerability Inspector
Takes one file or threat-model pairing and emits a finding (or confirms none). Fans out across files in parallel, with coordinator-enforced bounded concurrency so the inspection tool chain does not exceed rate limits on external vulnerability databases.
Critic: False-Positive Refiner
Takes the full findings list and proposes revisions: flag duplicates, downgrade suspected false positives, add cross-references between related findings. Runs at most twice. The refiner does not decide whether a finding ships — that is the judge's job.
Judge: Actionability Gate
Takes the refined findings and emits a decision per finding: actionable (include in the client report), informational (include in appendix), or discard (likely false positive). A sample of judge decisions is compared monthly against a senior security engineer's adjudications to keep the judge calibrated.
The same shape — producer, consumer, critic, judge, coordinator — handles a domain with almost no overlap with content production. That portability is the point of the pattern language. Enterprise rollouts of multi-agent systems can further standardize on shared archetype libraries; for a reference architecture, see our enterprise agent platform reference architecture. For CRM-integrated variants of this pipeline shape, our CRM Automation service applies the same archetypes to lead routing, enrichment, and qualification flows.
Conclusion
Multi-agent systems are still in the era where every team solves the same problems from first principles. A pattern language short-circuits that cycle: five archetypes, twelve composition rules, eight failure modes. The language is not a prescription for how every system should look; it is a shared vocabulary that makes design conversations precise and production incidents debuggable.
The two worked examples — content production and security auditing — show the language is portable across domains that have almost nothing in common at the surface. That portability is the signal that the abstraction is right. Whether your team is building with LangGraph, CrewAI, the OpenAI Agents SDK, or the Claude Agent SDK, the archetypes map cleanly and the rules hold. Naming is half the battle.
Build Reliable Multi-Agent Systems
Whether you are architecting a new multi-agent pipeline or stabilizing an existing one, a shared pattern language makes the work measurably faster and safer. We help teams design, build, and operate agent systems that ship.