This post anchors to the open-source agent framework landscape as of May 20, 2026 — five canonical frameworks that have already shifted under teams' feet: Swarm archived, AutoGen in maintenance mode, LangGraph 1.0 GA, CrewAI 1.0 GA, Smolagents holding the sub-1,000-LOC code-first lane. Choosing among them by star count is the wrong axis. The right axis is mental model.

Two of the five are already in managed transition: OpenAI archived the Swarm repo in March 2025 and redirected users to the production Agents SDK; Microsoft moved AutoGen to maintenance mode in early 2026 and shipped the Microsoft Agent Framework (MAF) 1.0 as the greenfield successor in April 2026. Teams that evaluate frameworks from 2024 blog posts are still being sent to both deprecated paths. That mis-routing has real consequences — maintenance-mode frameworks receive bug fixes only, and archived repos receive nothing.

This guide covers each framework's mental model, core API, and production posture; the 18-row × 5-column comparison matrix that consolidates every decision dimension in one place; a 6-question decision tree; and the case for ignoring star counts entirely in favor of verified production case studies. If you want to understand the patterns these frameworks run — fan-out, pipeline, supervisor, debate, swarm — the companion post on 5 multi-agent orchestration patterns covers that orthogonal axis.

Key takeaways

01
Pick by mental model, not by star count.
LangGraph = graph (StateGraph + checkpointers). AutoGen = actor (async message-passing). CrewAI = role (Agent + Task + Process). Agents SDK = handoff (delegation IS the primitive). Smolagents = code-first script (CodeAgent writes Python). Each mental model maps to a different class of workflow. Matching model to workflow prevents painful rewrites.
02
Two frameworks are already in managed transition.
OpenAI Swarm was archived in March 2025, and the Swarm README now redirects to the Agents SDK. AutoGen entered maintenance mode in early 2026; Microsoft Agent Framework 1.0 reached GA in April 2026 as the greenfield successor. New builds on either path should use the successor, not the archived or maintenance-mode framework.
03
LangGraph trails on stars but leads on production case studies.
AutoGen holds the star lead (~50–55k) from its v0.2 era. LangGraph (~32.7k) has vendor-confirmed production deployments at Klarna (85M active users, 80% resolution-time reduction), LinkedIn, Uber (~21,000 developer hours saved), and Replit. Stars are a lagging indicator of hype; production case studies are a leading indicator of engineering confidence.
04
Smolagents wins on auditability: no other framework here has a ~1,000-LOC core.
HuggingFace built Smolagents around a single constraint: keep the core logic under 1,000 lines so any engineer can read and audit it in a sitting. The CodeAgent executes Python directly, which means arbitrary code runs, so sandbox it via Modal, E2B, Blaxel, or Docker. That trade-off (powerful but execution-risky) is the defining characteristic of the code-first mental model.
05
CrewAI is the fastest to scaffold; LangGraph has the richest checkpointing.
The CrewAI CLI scaffolder takes approximately 10 minutes from install to a working multi-agent flow, the fastest of the five. LangGraph's built-in Postgres and Redis checkpointers, plus time-travel debugging (replay any prior state, resume after crash or human approval), give it the richest production-grade state management of any open-source option as of May 2026.

01 — Framework Snapshot
Five frameworks, five mental models.

Before diving into each framework, it helps to have the five mental models laid out side by side. Every ergonomic difference (scaffolding time, learning curve, production fit) follows predictably from the mental model. Sections 02 through 06 cover each framework in depth; the matrix in Section 07 tabulates every dimension.

Graph

LangGraph

StateGraph + Checkpointers

Workflows are directed graphs with explicit state. Every edge is a transition, every node is a function or subgraph. Checkpointers (Postgres / Redis) enable time-travel debugging and human-in-the-loop approvals. Best fit: complex stateful workflows. GA: Oct 22, 2025.

langchain-ai/langgraph

Actor

AutoGen → MAF

Async actors + GroupChat

Agents are addressable actors exchanging typed messages; the runtime handles routing, lifecycle, and telemetry. In maintenance mode since early 2026 — Microsoft Agent Framework 1.0 (Apr 2026) is the greenfield successor for Azure / .NET stacks.

microsoft/autogen → MAF

Role

CrewAI

Agent + Task + Process

Each agent has a role, goal, backstory, and tool list. Tasks have expected outputs. The Crew runs a Process (Sequential or Hierarchical). CrewAI Flows adds @start / @listen / @router decorators for complex branching. Fastest scaffold: ~10 min. GA: Oct 26, 2025.

crewAIInc/crewAI

Handoff

Agents SDK

Agent + handoffs + guardrails

The multi-agent primitive IS the handoff — one agent delegates to another. Guardrails validate input and output. Sessions auto-manage conversation history. Sandbox Execution (Apr 2026) adds controlled file I/O, command execution, and state snapshotting. Swarm successor.

openai/openai-agents-python

Code-first

Smolagents

CodeAgent writes Python

Agents write Python code snippets as actions rather than calling JSON tool schemas. The core logic is ~1,000 lines — readable and auditable in a sitting. Model-agnostic via litellm. Sandbox via Modal, E2B, Blaxel, or Docker. HuggingFace ecosystem. Launched Dec 31, 2024.

huggingface/smolagents

02 — LangGraph
The graph mental model: StateGraph + checkpointers.

LangGraph 1.0 reached general availability on October 22, 2025, with zero breaking changes from v0.x — the existing API carries forward. The same day, LangChain 1.0 (Python and TypeScript) also went GA. The canonical 2026 entry point is create_agent (Python) or createAgent (TypeScript) with a middleware array exposing hooks before_agent, before_model, wrap_model_call, wrap_tool_call, after_model, and after_agent. For fully custom orchestration, StateGraph remains the lower-level primitive — define nodes, edges, and a typed state schema, then compile and run.

The defining production capability is checkpointing. LangGraph ships built-in checkpointers for Postgres and Redis. Any state can be replayed forward from any prior checkpoint — time-travel debugging in practice. When an agent crashes mid-workflow, resume from the last good checkpoint. When a workflow reaches a human-approval gate, suspend there and resume only after an asynchronous approval signal arrives. No other open-source framework in this comparison offers this depth of state management out of the box.
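The checkpoint-and-resume idea described above can be sketched without any framework. The example below is a toy, standard-library illustration and is not LangGraph's API: `ToyGraph`, its in-memory `checkpoints` list, and the `draft` / `review` / `publish` nodes are all hypothetical names. It assumes a checkpoint is simply "the node about to run plus a copy of the state", which is enough to show why a crash in one node does not force earlier nodes to run again.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, object]


@dataclass
class ToyGraph:
    """Toy graph runner with in-memory checkpoints (illustrative, not LangGraph)."""
    nodes: Dict[str, Callable[[State], State]] = field(default_factory=dict)
    edges: Dict[str, Callable[[State], Optional[str]]] = field(default_factory=dict)
    checkpoints: List[Tuple[str, State]] = field(default_factory=list)

    def add_node(self, name, fn, next_node):
        self.nodes[name] = fn
        self.edges[name] = next_node  # function of state -> next node name or None

    def run(self, start: str, state: State) -> State:
        current: Optional[str] = start
        while current is not None:
            # Record (node about to run, snapshot of state) before executing it.
            self.checkpoints.append((current, dict(state)))
            state = self.nodes[current](state)
            current = self.edges[current](state)
        return state

    def resume(self, index: int) -> State:
        # Restart from a saved checkpoint instead of from the beginning.
        node, state = self.checkpoints[index]
        return self.run(node, dict(state))


calls = {"draft": 0, "review": 0}


def draft(state: State) -> State:
    calls["draft"] += 1
    return {**state, "draft": "v1"}


def review(state: State) -> State:
    calls["review"] += 1
    if calls["review"] == 1:
        raise RuntimeError("simulated crash mid-workflow")
    return {**state, "approved": True}


def publish(state: State) -> State:
    return {**state, "published": state["approved"]}


graph = ToyGraph()
graph.add_node("draft", draft, lambda s: "review")
graph.add_node("review", review, lambda s: "publish")
graph.add_node("publish", publish, lambda s: None)

try:
    graph.run("draft", {"topic": "agents"})
except RuntimeError:
    pass  # the last checkpoint was taken just before the failing node

final_state = graph.resume(len(graph.checkpoints) - 1)
```

In this sketch `draft` runs once and `review` runs twice, because the resume starts at the failed node rather than at the graph entry point. A human-approval gate follows the same shape: stop after writing a checkpoint, then call `resume` when the approval arrives. A production checkpointer would persist snapshots to a database rather than a Python list.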

The production track record is the strongest of the five. According to LangChain's own case study post, Klarna's AI Assistant — powered by LangGraph and LangSmith — serves 85 million active users and reportedly reduced customer resolution time by 80%. LinkedIn runs a hierarchical recruiter agent and a SQL Bot. Uber's Developer Platform uses LangGraph for automated unit-test generation, reportedly saving ~21,000 developer hours. Replit built a human-in-the-loop multi-agent system on it. These are vendor-confirmed; treat them as directional, not audited.

The trade-off is the graph mental tax. Building a 3-step prototype in LangGraph feels like overkill — the StateGraph definition, conditional edges, and checkpointer setup add ceremony that CrewAI or the Agents SDK don't require at that scale. For workflows that need explicit state and production-grade recovery, LangGraph pays for itself within the first production incident. For throwaway prototypes, reach for something else. See the LangChain vs LangGraph comparison for the detailed API difference between the two.

LangGraph 1.0 GA — Oct 22, 2025

LangGraph 1.0 and LangChain 1.0 (Python + TypeScript) both went GA on October 22, 2025 with zero breaking changes from v0.x. The legacy createReactAgent prebuilt still works but is superseded by create_agent / createAgent with a middleware array. AgentExecutor remains in maintenance until December 2026. Source: LangChain Changelog.

03 — AutoGen
Actor model, maintenance mode, and the MAF successor.

AutoGen v0.4 was a major architectural rewrite that reorganized the codebase into three layered packages: autogen-core (the asynchronous actor runtime), autogen-agentchat (the multi-agent conversation primitives), and autogen-ext (integrations). The actor model means agents are addressable entities that exchange typed messages; the runtime handles routing, lifecycle, and OpenTelemetry span emission. GroupChat is the canonical multi-agent primitive — manager-routed or fixed-turn-order message passing across specialized agents.
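To make the actor description concrete, here is a minimal asyncio sketch of addressable agents with mailboxes, a runtime that routes typed messages, and a fixed-turn-order chat on top. It uses only the standard library and does not reproduce AutoGen's classes: `ChatMessage`, `ToyActor`, `ToyRuntime`, and `fixed_turn_chat` are hypothetical names, and each agent's `respond` function stands in for a model call.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ChatMessage:
    """A typed message; actors only ever see values of this type."""
    sender: str
    content: str


class ToyActor:
    def __init__(self, name: str, respond: Callable[[ChatMessage], str]):
        self.name = name
        self.respond = respond  # stand-in for an LLM call
        self.mailbox: asyncio.Queue = asyncio.Queue()

    async def handle_one(self) -> ChatMessage:
        message = await self.mailbox.get()
        return ChatMessage(sender=self.name, content=self.respond(message))


class ToyRuntime:
    """Routes messages to actors by name; actors never call each other directly."""

    def __init__(self) -> None:
        self.actors: Dict[str, ToyActor] = {}

    def register(self, actor: ToyActor) -> None:
        self.actors[actor.name] = actor

    async def send(self, recipient: str, message: ChatMessage) -> ChatMessage:
        actor = self.actors[recipient]
        await actor.mailbox.put(message)
        return await actor.handle_one()


async def fixed_turn_chat(runtime: ToyRuntime, order: List[str],
                          opening: ChatMessage, rounds: int) -> List[ChatMessage]:
    transcript = [opening]
    for turn in range(rounds):
        recipient = order[turn % len(order)]  # fixed turn order
        transcript.append(await runtime.send(recipient, transcript[-1]))
    return transcript


async def main() -> List[ChatMessage]:
    runtime = ToyRuntime()
    runtime.register(ToyActor("planner", lambda m: f"plan for: {m.content}"))
    runtime.register(ToyActor("critic", lambda m: f"critique of: {m.content}"))
    opening = ChatMessage(sender="user", content="ship v1")
    return await fixed_turn_chat(runtime, ["planner", "critic"], opening, rounds=2)


transcript = asyncio.run(main())
```

The design point is the indirection: agents are reached by name through the runtime, so routing, lifecycle, and telemetry can live in one place. A manager-routed variant would replace the `turn % len(order)` line with a function that picks the next speaker from the transcript.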

The critical fact for 2026 builds: AutoGen is in maintenance mode as of early 2026. It will receive bug fixes and security patches; it will not receive new features. The successor for new builds is the Microsoft Agent Framework (MAF) 1.0, which reached GA in April 2026 and merges AutoGen with Semantic Kernel under a single production-grade umbrella. For .NET shops, MAF is the only current path with active feature development. For teams already running AutoGen v0.4, a migration guide is live on Microsoft Learn; our Microsoft Agent Framework 1.0 guide walks the successor architecture in detail.

AutoGen's star count (~50–55k, verify at publish day) leads the field — an artifact of the large V0.2 era community rather than a signal of current momentum. That gap between star count and maintenance posture is the clearest example of why star-based framework selection fails.

AutoGen maintenance mode — verified May 2026

AutoGen is now in maintenance mode. It will not receive new features or enhancements and is community managed going forward. New users should start with Microsoft Agent Framework. Source: VentureBeat + Microsoft Learn migration guide. Exact wording: verify in the official AutoGen README at publish day. See also the Microsoft Agent governance toolkit guide for security patterns that carry from AutoGen into MAF.

04 — CrewAI
Role mental model: ~10-minute scaffold, two Process types.

CrewAI 1.0 OSS went GA on October 26, 2025. The mental model is role-based: each Agent has a role, goal, backstory, and tool list. Each Task has a description, expected output, and an assigned agent. A Crew collects agents and tasks, then runs a Process. Two processes are available in production: Sequential (tasks run in order) and Hierarchical (a manager LLM routes tasks to specialist agents). A Consensual process is planned but not yet implemented as of May 2026.
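The role-based structure can be pictured as plain data plus a loop. The sketch below is framework-free Python, not CrewAI's API: `RoleAgent`, `RoleTask`, and `run_sequential` are hypothetical names, and each agent's `work` callable stands in for an LLM call. It assumes the Sequential process passes each task's output to the next task as context.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RoleAgent:
    role: str
    goal: str
    # (task description, context from the previous task) -> output text.
    # A real agent would call a model here; this is a stand-in.
    work: Callable[[str, str], str]


@dataclass
class RoleTask:
    description: str
    expected_output: str
    agent: RoleAgent


def run_sequential(tasks: List[RoleTask]) -> List[str]:
    """Run tasks in order, feeding each output forward as context."""
    outputs: List[str] = []
    context = ""
    for task in tasks:
        result = task.agent.work(task.description, context)
        outputs.append(result)
        context = result
    return outputs


researcher = RoleAgent(
    role="Researcher",
    goal="Collect facts",
    work=lambda description, context: f"notes on {description}",
)
writer = RoleAgent(
    role="Writer",
    goal="Turn notes into prose",
    work=lambda description, context: f"{description} based on [{context}]",
)

outputs = run_sequential([
    RoleTask("agent frameworks", "bullet-point notes", researcher),
    RoleTask("a short summary", "one paragraph", writer),
])
```

A Hierarchical process would differ in one place: instead of iterating over `tasks` in order, a manager step would decide which agent receives which task.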

For more complex orchestration, CrewAI Flows adds a Python class-based primitive with @start, @listen, and @router decorators that orchestrate arbitrary mixtures of crews, agent calls, and regular Python code. Flows is the recommended primitive when the workflow outgrows Sequential or Hierarchical.

The fastest scaffold time of the five — approximately 10 minutes from pip install crewai to a working multi-agent flow via the CLI scaffolder. That speed makes CrewAI the natural choice for teams running parallel proof-of-concepts or for workflows that map cleanly to a researcher → writer → editor → reviewer role chain. For the detailed comparison between CrewAI and LangGraph architecture, see the deep-dive on LangGraph vs CrewAI.

As of v1.10.x, CrewAI supports MCP servers as first-class tools and A2A (Agent-to-Agent) task execution. CrewAI Inc. claims their framework powers approximately 1.4 billion agentic automations per month and is used by 60% of the Fortune 500. These figures are vendor-stated and have not been independently audited — treat them as directional marketing.

05 — Swarm → Agents SDK
Handoff mental model: Swarm archived, Agents SDK in production.

OpenAI Swarm is archived. It was always an educational framework — described in its own README as "exploring ergonomic, lightweight multi-agent orchestration" — and OpenAI stopped updating it when the production Agents SDK launched in March 2025. Bug reports and PRs against the openai/swarm repo are not triaged. The Swarm README explicitly redirects users to the Agents SDK.

The OpenAI Agents SDK is the production successor. The Python package (openai-agents) released v0.17.3 on May 19, 2026 — the day before this post's publish date. The TypeScript counterpart (@openai/agents) released v0.8.3 on April 6, 2026. The SDK is provider-agnostic — it supports OpenAI Responses and Chat Completions APIs alongside 100+ other LLMs.

The mental model is handoff-centric. The multi-agent primitive is a handoff: one agent delegates to another. Around that core sit guardrails (input and output validation), sessions (automatic conversation history), and tracing (debug and observability). In April 2026, OpenAI added two major capabilities: the Model-Native Harness (a control plane owning the agent loop, model calls, tool routing, handoffs, approvals, tracing, and run state) and Sandbox Execution (a controlled environment for file I/O, command execution, package install, port exposure, and state snapshotting). Sandbox providers include Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel — making the Agents SDK the strongest fit for TypeScript-first Next.js deployments of the five. Source: OpenAI, "The next evolution of the Agents SDK".
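The handoff idea is small enough to sketch in a few lines. The following is an illustrative, standard-library model and not the Agents SDK's API: `Handoff`, `ToyAgent`, `input_guardrail`, and `run_with_handoffs` are hypothetical names. It assumes an agent step returns either a final answer or a request to delegate, and that an input guardrail runs once before the loop starts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Handoff:
    """Returned by an agent step to delegate control to another agent."""
    target: str


@dataclass
class ToyAgent:
    name: str
    step: Callable[[str], Union[str, Handoff]]  # stand-in for a model call


def input_guardrail(text: str) -> None:
    # Reject input before any agent sees it.
    if not text.strip():
        raise ValueError("empty input rejected by guardrail")


def run_with_handoffs(agents: Dict[str, ToyAgent], start: str, user_input: str,
                      max_handoffs: int = 5) -> Tuple[str, List[str]]:
    input_guardrail(user_input)
    current = start
    trace = [current]  # which agents held control, in order
    while True:
        result = agents[current].step(user_input)
        if not isinstance(result, Handoff):
            return result, trace
        if len(trace) > max_handoffs:
            raise RuntimeError("handoff limit exceeded")
        current = result.target
        trace.append(current)


triage = ToyAgent(
    "triage",
    lambda text: Handoff("billing") if "refund" in text.lower()
    else "Routed to general support.",
)
billing = ToyAgent("billing", lambda text: "Refund request logged.")
agents = {agent.name: agent for agent in (triage, billing)}

answer, trace = run_with_handoffs(agents, "triage", "I need a refund")
```

The `trace` list is a tiny version of what tracing provides: a record of which agent held control at each step. The `max_handoffs` cap is an assumption added here so that two agents delegating to each other cannot loop forever.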

OpenAI Swarm is archived — use the Agents SDK

The openai/swarm repo has been archived since March 2025. Bug reports are not triaged. The production successor is the OpenAI Agents SDK (Python v0.17.3 as of May 19, 2026 · TypeScript v0.8.3 as of April 6, 2026). Any tutorial or blog post from 2024 that recommends building on Swarm is directing you to a dead end.

06 — Smolagents
Code-first script: ~1,000 LOC of auditable core.

HuggingFace launched Smolagents on December 31, 2024 with an unusual pitch: "a barebones library for agents that think in code." The core logic is approximately 1,000 lines — deliberately kept small so any engineer can read and audit it in a single sitting. That constraint is the differentiator, not an oversight.

The mental model is code-first. The primary agent type, CodeAgent, writes Python code snippets as its actions rather than calling JSON tool schemas. This makes tool composition natural — agents can nest function calls, use loops, and apply conditionals in the same way a developer would. The alternative, ToolCallingAgent, uses the classic JSON tool-call format for teams that prefer the more constrained interface. Both extend MultiStepAgent, the ReAct-loop base class.

Code-first execution means arbitrary Python runs — which requires a sandbox in any production or multi-tenant context. Supported sandbox providers: Modal, E2B, Blaxel, and Docker. Without a sandbox, CodeAgent executes in the host process, which is acceptable for local development and fully trusted environments only.
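A toy version of "actions are code" shows both the appeal and the risk. The sketch below is not Smolagents code: `run_code_action`, `lookup_price`, and the `final_answer` variable convention are hypothetical, and the snippet string stands in for what a model would write. Note the loud caveat: trimming `__builtins__` as done here is not a security boundary and is easy to escape, which is exactly why real isolation belongs in a sandbox provider such as those listed above.

```python
from typing import Callable, Dict


def run_code_action(snippet: str, tools: Dict[str, Callable]) -> object:
    """Execute a model-written snippet with tools exposed as plain functions.

    WARNING: the restricted builtins below only keep the demo tidy.
    They do NOT make exec() safe for untrusted code.
    """
    allowed_builtins = {"len": len, "range": range, "sum": sum, "sorted": sorted}
    namespace = {"__builtins__": allowed_builtins, **tools}
    exec(snippet, namespace)
    # Convention assumed for this sketch: the snippet assigns `final_answer`.
    return namespace.get("final_answer")


PRICES = {"apple": 0.5, "pear": 0.75, "fig": 3.0}


def lookup_price(item: str) -> float:
    """Hypothetical tool: return the unit price of an item."""
    return PRICES[item]


# One action that loops over a tool and filters the results, something a
# JSON tool-call interface would need several round trips to express.
model_written_snippet = """
prices = [lookup_price(item) for item in ["apple", "pear", "fig"]]
final_answer = sum(p for p in prices if p < 2.0)
"""

result = run_code_action(model_written_snippet, {"lookup_price": lookup_price})
```

Here a single action calls the tool three times, filters, and aggregates. The same flexibility means a snippet could just as easily do something destructive in the host process, so the sandbox requirement is not optional outside trusted local development.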

Smolagents is model-agnostic — any LLM accessible via litellm works, as do local models via transformers or ollama. MCP tool support is available via MCPClient; MCP resources support is an open feature request (issue #1460) and not yet shipped as of May 2026. The framework holds approximately 26.5k GitHub stars (verify at publish day) under an Apache 2.0 license (verify in the LICENSE file). Latest stable: v1.24.0, released January 16, 2026.

Worst fit: production multi-tenant systems at scale without adding your own checkpointing layer. Smolagents deliberately keeps state management out of scope; the 1,000-LOC ethos means the framework does not try to solve every problem. For HuggingFace-ecosystem teams that need auditability above all else, it remains the only framework in this comparison whose core an engineer can realistically read end to end in a single sitting.

07 — Comparison Matrix
18 dimensions, five frameworks, one table.

The table below is the canonical single-page reference for choosing among the five frameworks. Star counts and version numbers are dated May 20, 2026 and flagged for re-verification at your publish or evaluation date — both move daily. All other rows draw from primary sources cited throughout this post.

GitHub repo

Canonical source

LangGraph: langchain-ai/langgraph · AutoGen: microsoft/autogen · CrewAI: crewAIInc/crewAI · Agents SDK: openai/openai-agents-python (Swarm: openai/swarm — archived) · Smolagents: huggingface/smolagents

All five on GitHub

Stars (May 2026) ⚠️

Verify on publish day

AutoGen ~50–55k · CrewAI ~44.5k · LangGraph ~32.7k · Agents SDK Python ~26.5k · Smolagents ~26.5k. AutoGen leads because of V0.2 era hype, not 2026 momentum. Star counts move daily — flag before quoting.

Verify on GitHub

License

Open-source terms

LangGraph: MIT · AutoGen: Code MIT + Docs CC-BY 4.0 · CrewAI: MIT · Agents SDK: MIT · Smolagents: Apache 2.0 (verify LICENSE file at publish day). All five allow commercial use.

All permissive

Latest stable

Version snapshot — May 2026 ⚠️

LangGraph: 1.0 GA (Oct 22 2025) · AutoGen: v0.7.x · CrewAI: v1.10.x (1.0 GA Oct 26 2025) · Agents SDK Python: v0.17.3 (May 19 2026) · Agents SDK JS: v0.8.3 (Apr 6 2026) · Smolagents: v1.24.0 (Jan 16 2026).

Verify before build

Maintenance state

Active vs transition

LangGraph: active development. AutoGen: maintenance mode (early 2026) — successor MAF 1.0 (Apr 2026). CrewAI: active development. Agents SDK: active development. Smolagents: active development.

Two in transition

Language(s)

Primary + port

LangGraph: Python + TypeScript (LangGraph.js v1.1.4). AutoGen: Python (.NET via MAF). CrewAI: Python only. Agents SDK: Python (primary) + TypeScript (@openai/agents v0.8.3, lags Python cadence). Smolagents: Python only.

Only 2 have TS

Mental model

The hidden axis

LangGraph: Graph (directed StateGraph + explicit state). AutoGen: Actor (async event-driven actors + typed messages). CrewAI: Role (Agent + Task + Process). Agents SDK: Handoff (delegation IS the primitive). Smolagents: Code-first script (CodeAgent writes Python).

Match to workflow

Core entry API

2026 canonical entry points

LangGraph: create_agent / createAgent + middleware array; StateGraph for custom. AutoGen: AgentChat, Core runtime, GroupChat. CrewAI: Agent + Task + Crew(process=…) + Flow decorators. Agents SDK: Agent(instructions, tools, handoffs, guardrails) + Runner.run(). Smolagents: CodeAgent / ToolCallingAgent.

See framework docs

Built-in supervisor

Manager routing native?

LangGraph: yes (Supervisor pattern in create_agent middleware + StateGraph). AutoGen: yes (GroupChat manager mode). CrewAI: yes (Hierarchical Process with manager LLM). Agents SDK: yes (handoffs ARE the supervisor). Smolagents: patternable only (one agent invokes another via tools).

All 4 except Smol

State checkpointing

Resume after crash?

LangGraph: native (Postgres + Redis checkpointers; time-travel debug; human-in-loop). AutoGen: runtime-level session state. CrewAI: build-your-own (CrewAI Flows + external state). Agents SDK: session-level auto conversation history only. Smolagents: out of scope by design.

LangGraph only — native

MCP support

Model Context Protocol

LangGraph: mature first-class integration. AutoGen: via MAF / community adapters. CrewAI: native (v1.10.x + A2A v1.8.0). Agents SDK: native (hosted tools + MCP servers). Smolagents: MCP tools yes (MCPClient); MCP resources pending (issue #1460). MCP spec: 2025-11-25.

4 of 5 native

TypeScript port

Frontend / Next.js teams

LangGraph: langgraph.js (v1.1.4, stable). AutoGen: .NET only via MAF. CrewAI: Python only. Agents SDK: @openai/agents (v0.8.3, lags Python). Smolagents: Python only.

LangGraph.js or Agents SDK JS

Vercel / Next.js fit

Edge + serverless context

LangGraph: via langgraph.js + AI SDK. AutoGen: not natively friendly. CrewAI: Python backend behind a Next.js API route. Agents SDK: Sandbox built-in for Vercel; JS port maps cleanly to AI SDK. Smolagents: Python only — Next.js via API route only.

Agents SDK wins

Production case studies

Verified 2026 deployments

LangGraph: Klarna (85M users), LinkedIn (recruiter+SQL), Uber (~21k dev-hrs saved), Replit — vendor-confirmed. AutoGen: Microsoft internal (historical). CrewAI: 1.4B automations / 60% F500 — vendor-stated only, not independently audited. Agents SDK: OpenAI internal + sandbox ecosystem. Smolagents: AWS multi-model healthcare (AWS-published).

LangGraph leads verified

Tutorial-to-first-agent

Time to working prototype

CrewAI: ~10 min (CLI scaffolder — fastest). Agents SDK: ~15 min (handoffs example). Smolagents: ~15 min (CodeAgent + LiteLLM). LangGraph: ~30 min (LangChain Academy or Python quickstart). AutoGen: ~30 min (AgentChat quickstart).

CrewAI fastest

Learning curve / surface area

Cognitive overhead

Low: CrewAI (role abstraction is intuitive), Agents SDK (handoff = familiar), Smolagents (1,000 lines to read). Medium: LangGraph (graph mental tax; StateGraph definition adds ceremony). Medium: AutoGen (actor model learning curve).

CrewAI / Agents SDK easiest

Best fit

Ideal workflow shape

LangGraph: complex stateful workflows; multi-step graph with checkpoints + human-in-loop. AutoGen → MAF: distributed actor systems; .NET / Azure shops. CrewAI: role-defined pipelines (researcher → writer → editor); fastest scaffold. Agents SDK: OpenAI / multi-provider shops; sandboxed code execution; TypeScript-friendly. Smolagents: sub-1000-LOC auditability; code-first agents; HuggingFace ecosystem.

Depends on workflow

Worst fit

Avoid when…

LangGraph: throwaway 3-step prototype (overkill). AutoGen: new greenfield 2026 builds (use MAF instead). CrewAI: graph workflows with checkpointing (Flows is improving but not graph-native). Agents SDK: long-horizon stateful workflows (handoffs are stateless by default). Smolagents: production multi-tenant at scale without adding your own checkpointing.

Know your anti-fit

The matrix above is the asset that doesn't exist elsewhere in consolidated form. Most published comparisons either cover two or three frameworks, drop half the dimensions, or miss the Swarm → Agents SDK and AutoGen → MAF transition rows entirely. The 18-row format is designed to be the single-tab reference a team can share during a framework-selection decision. Verify star counts and version numbers at your evaluation date — both change daily.

For the broader architectural patterns that run on top of any of these frameworks — fan-out, pipeline, supervisor, debate, swarm — the agent architecture patterns taxonomy covers that layer. For known failure modes when implementing any of these, see orchestration anti-patterns.

08 — Decision Tree
Six questions, one framework.

Most teams don't need to evaluate all five frameworks. They need to answer six questions in order and arrive at one. The tree below follows real 2026 production constraints — TypeScript-first stacks, auditability requirements, workflow shape, checkpointing needs, cloud-stack alignment, and sandbox requirements. Work through each question; stop at the first "Yes."

TypeScript-first stack?

If your team ships TypeScript primarily, your framework options narrow to three: LangGraph.js (v1.1.4, mature), OpenAI Agents SDK JS (@openai/agents v0.8.3), or Mastra (TypeScript-native, not in this five-way comparison). All three have first-class TS support. Python-only frameworks (CrewAI, Smolagents) require a Python backend behind an API route.

LangGraph.js · Agents SDK JS · Mastra

Need to audit every line of agent code?

If auditability or security review requires that an engineer can read the entire agent framework core in one sitting, only Smolagents qualifies — ~1,000 lines of core logic, Apache 2.0. No other framework in this comparison makes that promise. Accept the trade-off: no built-in checkpointing; run in a sandbox (Modal, E2B, Blaxel, Docker) for any production or multi-tenant context.

Smolagents

Workflow is role-shaped (researcher / writer / editor)?

If your workflow maps naturally to a team of role-based specialists — researcher feeds writer feeds editor feeds reviewer — CrewAI's mental model fits without adaptation. The role / goal / backstory primitives and the ~10-minute CLI scaffolder make this the lowest-friction path for role-shaped pipelines. Escalate to CrewAI Flows if branching logic gets complex.

CrewAI

Need explicit state graph + checkpointing + human-in-loop?

If your workflow requires resuming after a crash, time-travel debugging, or asynchronous human approval gates, LangGraph is the only framework in this comparison with native Postgres + Redis checkpointers and a vendor-confirmed production track record covering all three. The graph mental tax is real; it pays for itself in the first production incident where you need to replay state.

LangGraph

New build on Azure / .NET / Microsoft stack?

If you are starting a new build on an Azure or Microsoft stack, do not start with AutoGen — it is in maintenance mode as of early 2026. Start with Microsoft Agent Framework 1.0 (GA April 2026), which merges AutoGen + Semantic Kernel under active feature development. For existing AutoGen codebases, the migration guide is at learn.microsoft.com.

Microsoft Agent Framework 1.0 — not AutoGen

Multi-agent handoffs + sandbox execution + multi-provider?

If none of the above apply — your stack is Python or TypeScript, you need a lightweight handoff-based multi-agent system with sandbox execution and you want to route across multiple LLM providers — the OpenAI Agents SDK (Swarm successor) is the production path. Python v0.17.3 (May 19 2026). Vercel Sandbox built-in. Provider-agnostic.

OpenAI Agents SDK
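The six questions above can be written down as an ordered chain of checks, which makes the "stop at the first Yes" rule explicit. This is a sketch of the tree as stated in this post, not a general-purpose selector; the `Requirements` field names and the `pick_framework` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    typescript_first: bool = False
    must_audit_core: bool = False
    role_shaped_workflow: bool = False
    needs_checkpointing: bool = False   # state graph, resume, human-in-loop
    new_microsoft_stack: bool = False   # new build on Azure / .NET


def pick_framework(req: Requirements) -> str:
    """Evaluate the questions in order and return at the first 'Yes'."""
    if req.typescript_first:
        return "LangGraph.js, Agents SDK JS, or Mastra"
    if req.must_audit_core:
        return "Smolagents"
    if req.role_shaped_workflow:
        return "CrewAI"
    if req.needs_checkpointing:
        return "LangGraph"
    if req.new_microsoft_stack:
        return "Microsoft Agent Framework 1.0"
    # Question six is the fall-through: handoffs, sandbox, multi-provider.
    return "OpenAI Agents SDK"


choice = pick_framework(Requirements(role_shaped_workflow=True, needs_checkpointing=True))
```

Because order matters, a team with a role-shaped workflow that also needs checkpointing lands on CrewAI here. If checkpointing is the harder constraint, answer the role-shaped question "No" and let the tree continue to LangGraph.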

09 — Vanity Metric
Star counts are decoy data.

GitHub star counts are the most-cited and least-useful signal in framework selection. The chart below shows approximate star counts as of May 2026 (verify at publish day; all move daily). Note the inversion: AutoGen leads on stars by a wide margin yet is in maintenance mode. LangGraph trails significantly yet has the most named, vendor-confirmed production deployments of the five frameworks compared here.

GitHub stars, May 2026 snapshot ⚠️ verify at publish day
Source: github.com, retrieved May 24, 2026. Star counts move daily.

AutoGen: ~50–55k (⚠️ maintenance mode; successor: Microsoft Agent Framework 1.0)
CrewAI: ~44.5k (active development · 1.0 GA Oct 26, 2025)
LangGraph: ~32.7k (most production case studies of the five · 1.0 GA Oct 22, 2025)
Agents SDK (Python): ~26.5k (Swarm successor · v0.17.3 May 19, 2026 · JS port: ~2.6k)
Smolagents: ~26.5k (~1,000 LOC core · Apache 2.0 · v1.24.0 Jan 16, 2026)

The star-count inversion is worth interpreting directly. AutoGen accumulated its star count during the 2023–2024 period, when it was the most-discussed multi-agent framework in AI research circles. That popularity predates its maintenance-mode transition by two years. Stars from that era persist; feature development doesn't. A team that selects by star count in May 2026 ends up building on a framework that has already stopped receiving new features.

The more meaningful signal is production case study quality. LangGraph has four vendor-confirmed deployments with named companies and quantified outcomes (Klarna 85M users / 80% resolution reduction, Uber ~21K dev-hours). Those are testable claims backed by specific engineering teams. CrewAI's "1.4B automations / 60% F500" is vendor-stated marketing without third-party corroboration — informative for momentum, not for technical confidence. Smolagents has one independent cloud-vendor-published case study (AWS). Agents SDK and AutoGen have internal OpenAI and Microsoft footprints that are real but harder to evaluate externally.

The forward projection is clearer than the current snapshot: teams adopting LangGraph 1.0 or the Agents SDK today are on active development paths with growing ecosystems. Teams adopting AutoGen for new builds face a migration to MAF within the next 12–18 months anyway. The star-count shortcut actively leads teams into that migration debt.
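The inversion can be checked mechanically against the figures quoted in this post. In the sketch below, star counts are the approximate May 2026 values from the snapshot (AutoGen uses the lower bound of its ~50–55k range), and `named_case_studies` counts only the named external deployments cited here. Treating the internal OpenAI and Microsoft footprints and CrewAI's aggregate vendor figures as zero is a simplifying assumption of this example, not a claim about real-world usage.

```python
# Approximate figures as quoted in this post; all values need re-verification.
frameworks = {
    "AutoGen":    {"stars_k": 50.0, "state": "maintenance", "named_case_studies": 0},
    "CrewAI":     {"stars_k": 44.5, "state": "active",      "named_case_studies": 0},
    "LangGraph":  {"stars_k": 32.7, "state": "active",      "named_case_studies": 4},
    "Agents SDK": {"stars_k": 26.5, "state": "active",      "named_case_studies": 0},
    "Smolagents": {"stars_k": 26.5, "state": "active",      "named_case_studies": 1},
}

# Ranking by stars alone picks the framework in maintenance mode.
top_by_stars = max(frameworks, key=lambda name: frameworks[name]["stars_k"])

# Ranking by named case studies picks a different framework.
top_by_case_studies = max(
    frameworks, key=lambda name: frameworks[name]["named_case_studies"]
)

# Filtering on maintenance state first removes the decoy entirely.
active_by_stars = sorted(
    (name for name, row in frameworks.items() if row["state"] == "active"),
    key=lambda name: frameworks[name]["stars_k"],
    reverse=True,
)
```

The two rankings disagree on the winner, and a maintenance-state filter applied before any star sort changes the shortlist. That is the practical form of the argument: check lifecycle status first, evidence second, and stars last if at all.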

The framework with the most stars is in maintenance mode. The framework with the most production case studies has the fewest stars of the mature options. That inversion is the entire argument against star-count selection.
Digital Applied synthesis, May 20, 2026

10 — Hidden Axis
Mental model is what you're really choosing.

Every ergonomic friction point in agent framework adoption traces back to a mismatch between the team's mental model of the workflow and the framework's native mental model. A team that thinks of their agents as a cast of specialized roles will find CrewAI frictionless and LangGraph foreign — even though LangGraph is technically capable of the same workflow. A team that thinks in directed graphs will find LangGraph natural and CrewAI's role-based abstractions limiting when they need a conditional edge.

The five mental models in this comparison are not arbitrary style choices. Each reflects a different philosophy about what the core agent primitive should be:

Graph (LangGraph): The workflow IS the graph. State is explicit. Every transition is an edge. Complexity is managed by making structure visible.
Actor (AutoGen → MAF): Agents ARE actors. They communicate by message-passing. The runtime handles the rest. Complexity is managed by isolating agents behind typed message boundaries.
Role (CrewAI): Agents ARE people on a team. Give them a job description and they collaborate. Complexity is managed by organizational analogy.
Handoff (Agents SDK): The multi-agent primitive IS the handoff. One agent passes control to another. Complexity is managed by composing simple delegation chains.
Code-first script (Smolagents): The agent IS a Python programmer. Actions are code. Complexity is managed by the full power of the language — loops, conditionals, function composition — with the constraint of a 1,000-line core that never hides what it does.

The practical implication: before running a framework benchmark or reviewing a star-count table, ask which mental model your workflow already follows. If the answer is clear — role-based pipelines with defined specialists — the framework selection is nearly determined. If the answer is ambiguous — a mix of stateful graph steps and role-based delegation — that ambiguity is the signal that a hybrid architecture is worth evaluating, and the Claude Agent SDK production patterns guide covers cross-framework production discipline that applies regardless of which orchestration layer you choose.

For TypeScript teams evaluating this landscape, the Vercel AI SDK 6 deep dive covers the stop-condition and tool-call API that bridges any of these frameworks into a Next.js deployment. The MCP vs LangChain vs CrewAI comparison covers the protocol-versus-framework distinction, which becomes relevant once MCP tool support is a requirement.

Five frameworks compared (May 2026 snapshot)
LangGraph, AutoGen, CrewAI, OpenAI Agents SDK (Swarm successor), and Smolagents. Each with a distinct mental model, maintenance posture, and production track record.

LangGraph 1.0 GA: LangGraph + LangChain 1.0 (Oct 22 2025)
Zero breaking changes from v0.x. Built-in Postgres + Redis checkpointers, time-travel debug, and human-in-loop gates. Klarna (85M users), Uber (~21K dev-hrs), LinkedIn, Replit all on record. Most production case studies.

AutoGen → MAF: Microsoft Agent Framework 1.0 GA (Apr 2026)
AutoGen entered maintenance mode early 2026. MAF 1.0 reached GA in April 2026, merging AutoGen + Semantic Kernel. New Azure / .NET builds should start with MAF, not AutoGen.

Smolagents core: auditable by design (~1,000 LOC)
HuggingFace's explicit constraint: keep the core logic under 1,000 lines. CodeAgent writes Python as actions. Run in Modal, E2B, Blaxel, or Docker sandbox for production or multi-tenant workloads. Apache 2.0 · v1.24.0

The framework landscape — May 2026

Mental model first. Stars last. Two frameworks already in transition.

The open-source agent framework landscape in May 2026 is not a stable five-way race. Two of the five are in managed transition — Swarm to the Agents SDK, AutoGen to Microsoft Agent Framework. Teams still referencing 2024 comparison posts are being directed to archived repos and maintenance-mode codebases. The mental model axis — graph, actor, role, handoff, code-first script — is the selection criterion that actually predicts ergonomic fit and long-term maintenance cost. Star counts predict neither.

LangGraph 1.0 holds the strongest production case study record of the five and the richest built-in checkpointing. The Agents SDK is the active production successor to Swarm, with the strongest TypeScript and Vercel story. CrewAI delivers the fastest scaffold time and the most intuitive role-based mental model. Smolagents is the only option with a sub-1,000-LOC auditable core. MAF 1.0 is the path for Microsoft and Azure stacks — not AutoGen, which is now feature-frozen. For the multi-agent orchestration patterns that run on top of any of these — fan-out, pipeline, supervisor, debate — see the companion post on 5 multi-agent orchestration patterns. For the Microsoft successor in depth, see the Microsoft Agent Framework 1.0 guide. For the LangGraph architecture deep-dive, see LangGraph vs CrewAI head-to-head. If you are choosing a framework for a production AI transformation initiative, our AI transformation service starts with exactly this kind of architecture selection — matched to your workflow shape, not to the current leaderboard.

The broader signal from this landscape is about framework lifecycle management, not just selection. Open-source agent frameworks move fast. The framework that was the recommended starting point in 2024 (AutoGen) may not be the recommended path in 2026 (MAF). Building with a modular architecture — keeping business logic separated from orchestration layer — preserves the option to migrate when the framework landscape shifts again, which it will.
