Development · Framework · 12 min read · Published May 17, 2026

Four enterprise patterns · spec-correct OAuth 2.1 · token passthrough forbidden

MCP Server Patterns for Enterprise AI Agents

Spec 2025-11-25 mandates OAuth 2.1 + PKCE S256, RFC 9728 Protected Resource Metadata, and RFC 8707 Resource Indicators. Token passthrough is explicitly forbidden. Four enterprise topologies cover the deployment surface — from single-tenant stdio to federated gateway across a 200-server estate.

Digital Applied Team
Senior engineers · Published May 17, 2026
Sources: 11
  • MCP spec version: 2025-11-25 (current release)
  • Mandatory PKCE method: S256 (when technically capable)
  • Token passthrough: 0 (explicitly forbidden per spec)
  • Enterprise topologies: 4 (covered in this guide)

The Model Context Protocol gives AI agents a standardized interface to tools, prompts, and resources — but spec version 2025-11-25 introduced a mandatory authorization framework that fundamentally changes how enterprise teams should design their MCP server deployments. OAuth 2.1 with PKCE S256, RFC 9728 Protected Resource Metadata, RFC 8707 Resource Indicators, and an explicit ban on token passthrough together define a security posture that most tutorial coverage still glosses over.

The stakes are material. MCP servers sit at the intersection of language models and production data surfaces — databases, APIs, internal tooling. A misdesigned auth layer doesn't just introduce a compliance gap; it creates a confused-deputy attack surface where a malicious prompt can redirect tool calls and exfiltrate tokens through a legitimately-authorized agent. According to the spec's security section, the protocol intentionally grants "powerful capabilities through arbitrary data access and code execution paths" — and expects implementors to handle the security weight that comes with that.

This guide covers the 2025-11-25 spec's transport and auth requirements in engineering-grade detail, then maps four enterprise deployment topologies — single-tenant, multi-tenant row-isolated, federated gateway, and edge-cached read-only — against the spec's constraints. Every pattern includes its auth assignment, caching posture, and the trade-off that rules it in or out for a given deployment.

Key takeaways
  1. The current MCP spec version is 2025-11-25. It defines exactly two standard transports: stdio and Streamable HTTP. The earlier HTTP+SSE transport from the 2024-11-05 revision is deprecated and should not be used for new servers. Backward compatibility is preserved via the MCP-Protocol-Version header.
  2. Authorization is optional, but when used, the spec is prescriptive. For remote servers on HTTP transports, the spec mandates OAuth 2.1 with PKCE S256, RFC 9728 Protected Resource Metadata for authorization-server discovery, and RFC 8707 Resource Indicators on every token request; RFC 7591 Dynamic Client Registration is optional for servers. stdio servers should read credentials from the environment.
  3. Token passthrough is explicitly forbidden. MCP servers MUST NOT forward an inbound client token to upstream APIs. If the server calls an upstream API, it must obtain a separate token from the upstream authorization server. In our field experience, this is the most common enterprise design failure in MCP deployments.
  4. The confused-deputy attack is a spec-documented risk. A proxy server that reuses a static client ID with an upstream authorization server, combined with cached consent cookies, allows an attacker to bypass user consent. Mitigation requires per-client consent storage and state parameter validation before the redirect completes.
  5. Four topologies cover the majority of real enterprise deployments. Single-tenant for isolated internal teams; multi-tenant row-isolated for SaaS-style multi-customer MCPs; federated gateway for large estates with central audit requirements; edge-cached read-only for high-RPS tool discovery where tools/list responses are stable and safe to cache.

01 · Spec Overview
MCP 2025-11-25: two transports, one deprecated.

The MCP specification version 2025-11-25 is the current revision. It defines the wire format as JSON-RPC 2.0, UTF-8 encoded, with stateful connections and an explicit capability negotiation phase between client and server on every new session. Servers expose three primitive types to clients: Tools, Prompts, and Resources. Clients expose three primitive types to servers: Sampling, Roots, and Elicitation. This direction matters — Sampling, Roots, and Elicitation are CLIENT features, not server features. A common misconception in enterprise architectures is treating them as server capabilities, which produces incorrect access-control reasoning.
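The direction of each primitive can be made concrete with a small sketch. The helper names, the client name, and the version string in `clientInfo` are illustrative assumptions; only the `initialize` method, the `protocolVersion` field, and the six capability names come from the text above.

```python
# Primitive ownership as described above.
SERVER_FEATURES = {"tools", "prompts", "resources"}
CLIENT_FEATURES = {"sampling", "roots", "elicitation"}


def build_initialize_request(request_id: int) -> dict:
    """Build a JSON-RPC 2.0 initialize request that advertises client features."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-11-25",
            # Only client-side primitives belong in a client's capabilities.
            "capabilities": {"sampling": {}, "roots": {}, "elicitation": {}},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-agent", "version": "0.1.0"},
        },
    }


def misplaced_capabilities(declared: dict, role: str) -> set:
    """Return capability names that belong to the other side of the connection."""
    other_side = SERVER_FEATURES if role == "client" else CLIENT_FEATURES
    return set(declared) & other_side
```

A design review can run `misplaced_capabilities` over a proposed capability map to catch the misconception early: a server that declares `sampling` is flagged, while the client request above passes cleanly.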

The spec defines exactly two standard transports. stdio is the local process transport: the MCP host manages the server process and communicates over standard input and output. It is the appropriate transport for local agent tooling and development workflows, and credentials SHOULD be read from the environment rather than managed via OAuth. Streamable HTTP is the remote transport, introduced in the 2025-03-26 revision to replace the earlier HTTP+SSE (Server-Sent Events) transport from the 2024-11-05 revision. HTTP+SSE is now deprecated; backward compatibility is maintained through the MCP-Protocol-Version header mechanism, but new servers should not implement it. Clients that send an HTTP request without the version header are assumed to be speaking the 2025-03-26 protocol, not 2025-11-25.

Streamable HTTP introduces session management via the MCP-Session-Id header. Servers MAY assign a session ID in the InitializeResult response; clients MUST echo it on subsequent requests. Servers MAY terminate a session at any time — the client will receive an HTTP 404 and should start a fresh session. For clean shutdown, clients SHOULD send an HTTP DELETE to the session endpoint. DNS-rebinding protection requires servers to validate the Origin header on all incoming connections, returning HTTP 403 on invalid origins, and to bind to 127.0.0.1 rather than 0.0.0.0 when running locally.
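The request gate described above can be sketched in a few lines. This is a minimal illustration, not a full transport: the allowlist values and function names are hypothetical, headers are treated as a plain dict with exact-case keys, and letting requests without an Origin header through (non-browser clients usually omit it) and answering 400 when the session header is missing are both assumptions of this sketch.

```python
import secrets

# Hypothetical allowlist of browser origins permitted to reach this server.
ALLOWED_ORIGINS = {"http://localhost:3000", "https://agents.example.com"}

# In-memory session registry; a real deployment would use shared storage.
SESSIONS: set = set()


def open_session() -> str:
    """Create a cryptographically random session ID at initialization time."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS.add(session_id)
    return session_id


def close_session(session_id: str) -> None:
    """Handle a client's HTTP DELETE for clean shutdown."""
    SESSIONS.discard(session_id)


def gate_request(headers: dict) -> int:
    """Return the HTTP status a non-initialize request should receive."""
    origin = headers.get("Origin")
    if origin is not None and origin not in ALLOWED_ORIGINS:
        return 403  # invalid Origin: DNS-rebinding protection
    session_id = headers.get("MCP-Session-Id")
    if session_id is None:
        return 400  # assumption: session-requiring server rejects a missing header
    if session_id not in SESSIONS:
        return 404  # terminated or unknown session: client starts a fresh one
    return 200
```

After `close_session`, the same session ID produces a 404, which is the signal for the client to re-initialize.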

Spec clarification — what servers and clients each expose

Servers expose Tools, Prompts, and Resources. Clients expose Sampling, Roots, and Elicitation. These directions are fixed by the spec. Treating Sampling or Elicitation as server features produces incorrect security boundaries — the spec's source at modelcontextprotocol.io/specification/2025-11-25 is the authoritative reference.

02 · OAuth 2.1
OAuth 2.1 with mandatory PKCE S256.

Authorization for remote MCP servers is built on OAuth 2.1 (draft-ietf-oauth-v2-1-13), layered with four additional RFCs the spec references by name. This is not optional architecture guidance — the MCP spec's authorization section defines these as MUST requirements for HTTP transport implementations that support authorization.

The PKCE requirement deserves specific attention. MCP clients MUST use PKCE with the S256 code challenge method. Before initiating an authorization request, the client MUST verify PKCE support by inspecting the code_challenge_methods_supported field in the authorization server metadata document. If that field is absent or does not include S256, the client MUST refuse to proceed. The plain method is only permitted when S256 is not technically possible — in practice, all modern server environments support it. The plain fallback is an implementation escape hatch, not a design choice.
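As a minimal sketch of the client-side steps, the snippet below derives an S256 challenge from a verifier and checks the authorization server metadata before proceeding. The function names are illustrative; the derivation itself (SHA-256, then unpadded base64url) follows RFC 7636.

```python
import base64
import hashlib
import secrets


def new_code_verifier() -> str:
    """Generate a high-entropy verifier (86 URL-safe characters)."""
    return secrets.token_urlsafe(64)


def s256_challenge(verifier: str) -> str:
    """Derive the S256 code challenge: BASE64URL(SHA256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def supports_s256(as_metadata: dict) -> bool:
    """Refuse to proceed unless the server metadata advertises S256."""
    return "S256" in as_metadata.get("code_challenge_methods_supported", [])
```

The client keeps the verifier private, sends the challenge with `code_challenge_method=S256` on the authorization request, and presents the verifier on the token request.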

The supporting RFC ladder specifies four additional requirements: authorization server discovery via RFC 8414 Authorization Server Metadata; resource discovery via RFC 9728 Protected Resource Metadata (covered in section 03); audience-binding of tokens via RFC 8707 Resource Indicators (section 04); and dynamic client registration via RFC 7591, which is a MAY for servers but directly enables zero-configuration client onboarding for enterprise agent fleets. The spec also references draft-ietf-oauth-client-id-metadata-document-00 for client identity, which enables lightweight client verification without a dedicated registration endpoint.

Base framework · OAuth 2.1 (draft-13) · MUST (HTTP transports)
The authorization framework MCP builds on. Mandates PKCE, removes implicit grant and password grant, tightens redirect URI validation. The draft is stable enough for production implementation.

Server discovery · RFC 9728 Protected Resource Metadata · MUST (server + client)
MCP servers MUST implement this. Clients use the well-known endpoint to discover the authorization server associated with a given MCP resource URI. Prevents auth-server confusion attacks.

Token audience · RFC 8707 Resource Indicators · MUST (client)
MCP clients MUST include a resource parameter on authorization and token requests. The value is the canonical URI of the MCP server. Audience-bounds the token to a single server, which prevents token reuse across services.

Client registration · RFC 7591 Dynamic Client Registration · MAY (server)
MAY be implemented by servers. Enables agent clients to register themselves without manual admin intervention, which is critical for large-scale enterprise agent deployments where manual client registration is operationally impractical.

03 · Auth Discovery
RFC 9728 Protected Resource Metadata: mandatory for every remote server.

The most underimplemented requirement in current MCP deployments is RFC 9728 OAuth 2.0 Protected Resource Metadata. The spec is unambiguous: "MCP servers MUST implement OAuth 2.0 Protected Resource Metadata, and MCP clients MUST use it for authorization-server discovery." The MCP-specific discovery flow starts from the WWW-Authenticate challenge on a 401 response or from the server's well-known endpoint: clients locate the Protected Resource Metadata document, read the authorization server URI from it, then fetch RFC 8414 Authorization Server Metadata from that URI.

The practical impact: clients can bootstrap authorization for any compliant MCP server without out-of-band configuration. The server self-describes which authorization server governs it via its well-known document. For enterprise deployments, this matters because it decouples client configuration from server deployment — adding a new internal MCP server doesn't require updating every agent client's config file.

The spec's security section notes that authorization servers fetching metadata documents SHOULD consider Server-Side Request Forgery (SSRF) risks — a subtlety that matters when the gateway pattern (section 06) aggregates PRM documents from multiple internal MCP servers. Enterprise implementations should validate that well-known URIs resolve to internally-controlled hosts before fetching.
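A rough sketch of the client side of this discovery step, including the host check suggested above, might look like the following. The `resource_metadata` challenge parameter and the `authorization_servers` field come from RFC 9728; the trusted-suffix allowlist, the hostnames, and the function names are assumptions for illustration, and the regular expression is a simplification rather than a full header parser.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical allowlist: only fetch metadata from internally controlled hosts.
TRUSTED_HOST_SUFFIXES = (".internal.example.com",)


def resource_metadata_url(www_authenticate: str) -> Optional[str]:
    """Extract the resource_metadata URL from a WWW-Authenticate challenge."""
    match = re.search(r'resource_metadata="([^"]+)"', www_authenticate)
    return match.group(1) if match else None


def is_trusted(url: str) -> bool:
    """SSRF guard: require HTTPS and a hostname under a trusted suffix."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return parts.scheme == "https" and host.endswith(TRUSTED_HOST_SUFFIXES)


def pick_authorization_server(prm_document: dict) -> str:
    """Choose the first trusted issuer listed in the metadata document."""
    for issuer in prm_document.get("authorization_servers", []):
        if is_trusted(issuer):
            return issuer
    raise ValueError("no trusted authorization server advertised")
```

A client would call `is_trusted` on the extracted URL before fetching it, then pass the parsed JSON document to `pick_authorization_server`.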

04 · Token Binding
RFC 8707 Resource Indicators: preventing token misuse.

Even with RFC 9728 in place, a token issued for one MCP server can theoretically be reused at another server that trusts the same authorization server — if the token does not carry an explicit audience claim binding it to the intended resource. RFC 8707 Resource Indicators closes that gap. MCP clients MUST include a resource parameter on both the authorization request and the token request. The value is the canonical URI of the MCP server — for example, https://mcp.internal.example.com/analytics.

Authorization servers that support RFC 8707 embed the resource URI as the aud (audience) claim in the issued access token. The MCP server MUST validate that the token's audience matches its own canonical URI on every request. A token issued for mcp.internal.example.com/analytics will be rejected by mcp.internal.example.com/crm because the audience doesn't match, even if both servers share the same authorization server.

For multi-tenant deployments (pattern 2 in section 06), this mechanism extends naturally: the resource parameter can include a tenant-specific path segment, and the authorization server issues tenant-scoped tokens. Server-side validation of the audience claim then enforces tenant isolation at the token level before any row-level filtering is applied — providing defense in depth without additional application logic.
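Both halves of the binding can be sketched briefly: the client attaches `resource` to its authorization request, and the server compares the `aud` claim against its own canonical URI. The endpoint, client ID, and function names below are hypothetical, and the sketch assumes the token's signature and expiry have already been verified elsewhere.

```python
from urllib.parse import urlencode


def authorization_url(as_endpoint: str, client_id: str, redirect_uri: str,
                      resource: str, code_challenge: str, state: str) -> str:
    """Build an authorization request that carries the RFC 8707 resource parameter."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "resource": resource,  # canonical URI of the target MCP server
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
        "state": state,
    })
    return f"{as_endpoint}?{query}"


def audience_matches(claims: dict, canonical_uri: str) -> bool:
    """Server-side check: aud may be a single string or a list of strings."""
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    return canonical_uri in audiences
```

For the multi-tenant case, the same check applies with a tenant-specific canonical URI, so a token minted for one tenant path fails validation on another before any row-level filter runs.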

05 · Security Boundary
Token passthrough is explicitly forbidden.

The MCP spec's security section states it plainly: "MCP servers MUST NOT accept any tokens that were not explicitly issued for the MCP server." A companion requirement governs the upstream direction: "If the MCP server makes requests to upstream APIs, it may act as an OAuth client to them. The access token used at the upstream API is a separate token, issued by the upstream authorization server. The MCP server MUST NOT pass through the token it received from the MCP client."

The security rationale is the confused-deputy principle at the token level. If an MCP server proxies the inbound client token to an upstream API, the upstream API receives a token whose permissions it cannot independently validate — it is trusting the MCP server's judgment about the client's identity rather than the authorization server's. A compromised MCP server, or a server tricked by a prompt injection, can then make requests to upstream APIs with a client's full permission scope, in ways the client never authorized.

The correct pattern is token exchange: the MCP server authenticates the inbound client token against its own authorization server, extracts the client identity and requested scope, then independently obtains a new short-lived token from the upstream API's authorization server scoped to the specific operation being performed. This token exchange can use the OAuth 2.0 Token Exchange specification (RFC 8693) or service-to-service credentials managed separately. The upstream token's scope SHOULD be the minimum required for the tool call being executed — not the client's full scope.
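One way to picture the exchange is as a form body sent to the upstream authorization server, followed by a guard on the outbound header. The grant type and token type URNs are the ones defined in RFC 8693; the function names and example scope are assumptions. Note that the inbound token goes to the authorization server as `subject_token`, never to the upstream API itself.

```python
import hmac
from urllib.parse import urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_params(inbound_token: str, upstream_resource: str, scope: str) -> dict:
    """Parameters for an RFC 8693 request to the upstream authorization server."""
    return {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": inbound_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "resource": upstream_resource,  # audience-binds the new token upstream
        "scope": scope,                 # minimum scope for this one tool call
    }


def upstream_headers(inbound_token: str, upstream_token: str) -> dict:
    """Build upstream API headers, refusing to forward the client's own token."""
    if hmac.compare_digest(inbound_token.encode(), upstream_token.encode()):
        raise ValueError("token passthrough is forbidden")
    return {"Authorization": f"Bearer {upstream_token}"}
```

The request body is `urlencode(token_exchange_params(...))`, posted with the MCP server's own client authentication; teams that use separate service credentials instead can keep only the `upstream_headers` guard.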

MCP servers MUST NOT pass through the token received from the MCP client to upstream APIs. The upstream token is a separate, freshly issued credential, audience-bound to the upstream resource alone.
Digital Applied synthesis, May 17, 2026

06 · Enterprise Topologies
Four patterns that cover 95% of enterprise deployments.

Most published MCP guides describe a single-tenant tutorial and stop. Real enterprise estates need to reason across four deployment archetypes, each with different auth assignments, caching postures, and failure modes. The matrix below maps the four patterns against the spec requirements established in sections 01-05. For the engineering-level enterprise agent platform reference architecture, including ingress, orchestration, and observability layers, see our companion post.

Pattern 1 · Single-tenant server
stdio or Streamable HTTP · one team, one surface

One MCP server serves one internal team. OAuth 2.1 + PKCE S256 with the corporate IdP. Resource URI maps to the server's canonical URL; RFC 8707 audience binds the token. Optional in-memory tools/list cache. Trade-off: simplest to operate, but doesn't scale beyond one team without replication.

Best fit: start here for internal tooling

Pattern 2 · Multi-tenant row-isolated
Streamable HTTP · per-tenant resource parameter

One MCP server fronts many customers. Per-tenant resource parameter in the token request binds the audience claim to a tenant-specific URI. Server validates aud before applying row-level storage filters. No shared cache for mutating calls. Trade-off: strict audience validation must be enforced server-side; a missing check exposes cross-tenant data.

Best fit: SaaS analytics MCP, multi-org agents

Pattern 3 · Federated gateway
Streamable HTTP · gateway validates JWT, re-issues tokens

A single gateway implements RFC 9728 PRM and RFC 8707 for all downstream MCP servers. It validates the inbound client JWT, re-issues short-lived server-specific tokens via token exchange, and fans out to the appropriate downstream server. Gateway caches JWKS and PRM documents. Trade-off: central audit and policy enforcement, but the gateway is a single point of failure and a high-value attack target.

Best fit: 10+ server estates, central audit

Pattern 4 · Edge-cached read-only
Streamable HTTP · CDN caches tools/list only

Read-heavy MCP servers with stable tool surfaces fronted by an edge cache (Vercel Edge / Cloudflare). The CDN caches tools/list responses; mutating tools/call requests fall through to origin with full auth. MCP-Session-Id pinning preserves stateful sessions across cache nodes. Trade-off: stale tool surface if notifications/tools/list_changed notifications are not honored, so cache invalidation must be wired to the notification event.

Best fit: high-RPS discovery, dev agent fleets

The pattern selection is not arbitrary — each emerges from a different combination of tenancy model, regulatory posture, and throughput requirement. Pattern 1 is the correct starting point for any new internal MCP server and can be extended to pattern 2 or 3 if the tenancy or scale requirements evolve. Teams building their first MCP infrastructure should resist the temptation to start with a federated gateway — the operational complexity is significant and the audience for central audit is usually the security team, not the end-user engineers. For guidance on how to architect and build this infrastructure in a production environment, our web development team has direct experience deploying patterns 1-3 for enterprise clients.

Pattern 4 is specifically suited for scenarios where tool-discovery traffic significantly outweighs tool-invocation traffic, which is common in large agent fleets where many agents enumerate the available tool surface before deciding which server to invoke. The read-only nature of tools/list makes it safe to cache, but the cache TTL must be chosen conservatively and wired to the notifications/tools/list_changed notification to avoid serving stale tool definitions to agents.
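The caching rule for pattern 4 can be sketched as a TTL cache that is also purged when the server announces a changed tool list. The class name, the default TTL, and the injected `fetch` and `clock` callables are illustrative assumptions; `notifications/tools/list_changed` is the JSON-RPC method name the sketch listens for.

```python
import time

LIST_CHANGED = "notifications/tools/list_changed"


class ToolListCache:
    """Cache a tools/list response with a TTL and notification-driven purge."""

    def __init__(self, fetch, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._fetch = fetch          # callable that hits the origin server
        self._ttl = ttl_seconds
        self._clock = clock          # injectable for testing
        self._value = None
        self._expires_at = 0.0

    def get(self):
        """Return the cached tool list, refetching when empty or expired."""
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value

    def on_message(self, message: dict) -> None:
        """Drop the cached value when the server reports a changed tool list."""
        if message.get("method") == LIST_CHANGED:
            self._value = None
```

The TTL acts as a backstop in case a notification is lost; the purge on notification is what keeps agents from seeing stale tool definitions in the normal case. Only tools/list goes through this cache, and tools/call always reaches the origin.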

Our MCP server security best practices guide covers the full hardening surface for all four patterns, including session-ID format recommendations (<user_id>:<session_id> binding), input validation on tool arguments, and output sanitization before sending results to the LLM context.

07 · Security Pattern
Confused-deputy mitigation: what the spec actually requires.

The confused-deputy problem in MCP occurs when a proxy server acts as both a relying party for inbound MCP clients and an OAuth client to an upstream authorization server. The spec documents the specific attack: a proxy with a static client ID at the upstream AS plus dynamic registration for inbound MCP clients plus cached consent cookies enables an attacker to skip user consent by replaying a session. The fix is not a single control: Controls 1 and 2 below address the consent bypass directly, and Controls 3 and 4 harden the adjacent annotation and session surfaces.

Control 1 · Per-client consent storage

The proxy MUST store consent decisions per MCP client, not per-session. A cached consent cookie for one MCP client MUST NOT satisfy a consent check for a different MCP client connecting to the same proxy. Without this, an attacker who obtains a session cookie for one client can reuse it to satisfy consent for a different, more permissive client registration.

In short: isolate consent records by client_id

Control 2 · State parameter validation timing

The proxy MUST validate the state parameter in the authorization callback before completing the redirect to the MCP client. The state ties the authorization response to a specific client's authorization request and prevents an attacker from injecting a different authorization code into an in-progress flow. Validation must happen before redirect, not after.

In short: validate state before redirect completes

Control 3 · Tool annotation distrust

MCP clients MUST treat tool annotations as untrusted unless they come from a verified trusted server. A compromised or malicious MCP server can alter tool annotations to mislead the agent about a tool's behavior, permissions required, or data it accesses. Clients should prompt for user confirmation on sensitive tool invocations regardless of what annotations claim.

In short: treat annotations as untrusted by default

Control 4 · Session ID cryptographic hygiene

MCP servers that implement authorization MUST use cryptographically random session IDs. The spec recommends binding session IDs to user-specific information using a composite key format, for example storing sessions under a key that includes the user ID. This prevents session-fixation attacks where an attacker pre-generates a session ID and tricks a legitimate user into authenticating it.

In short: use composite user_id:session_id key format

Why this matters for enterprise MCP proxy deployments

The federated gateway pattern (Pattern 3) is the most exposed to confused-deputy risk because it, by design, acts as a single OAuth client to upstream authorization servers while serving many inbound MCP clients. Before deploying a federated gateway, the team owning the implementation should validate that all four controls above are present in the codebase — not just the OAuth 2.1 flow itself. Our 75-point MCP security audit checklist includes a dedicated section on proxy and gateway confused-deputy controls.
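Controls 1 and 2 can be sketched as two small in-memory stores. The class and method names are hypothetical, and a production proxy would persist both stores and expire pending states; the sketch only shows the keying and the ordering of the checks.

```python
import hmac
import secrets


class ConsentStore:
    """Control 1: consent is keyed by (user, client), never by session alone."""

    def __init__(self):
        self._granted = {}

    def grant(self, user_id: str, client_id: str, scopes: set) -> None:
        self._granted.setdefault((user_id, client_id), set()).update(scopes)

    def is_granted(self, user_id: str, client_id: str, scopes: set) -> bool:
        return set(scopes) <= self._granted.get((user_id, client_id), set())


class PendingAuthorizations:
    """Control 2: state is single-use and checked before any redirect is issued."""

    def __init__(self):
        self._pending = {}

    def start(self, client_id: str) -> str:
        state = secrets.token_urlsafe(32)
        self._pending[state] = client_id
        return state

    def complete(self, state: str, client_id: str) -> bool:
        expected = self._pending.pop(state, None)  # pop makes the state single-use
        if expected is None:
            return False
        return hmac.compare_digest(expected.encode(), client_id.encode())
```

In a callback handler, the redirect to the MCP client is only built after `complete` returns True, and consent for one `client_id` never satisfies `is_granted` for another.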

08 · Transport History
From HTTP+SSE to Streamable HTTP: what changed and when.

The MCP transport history is short but consequential. The original spec (2024-11-05) defined HTTP+SSE as the remote transport: a standard HTTP POST endpoint for client-to-server messages and a separate Server-Sent Events endpoint for server-to-client notifications. The 2025-03-26 revision replaced HTTP+SSE with Streamable HTTP, which unifies the two endpoints into a single connection that supports bidirectional streaming. The 2025-11-25 revision formalized Streamable HTTP as the only supported remote transport and officially deprecated HTTP+SSE — though it remains backward-compatible via the protocol-version header.

The practical consequence: any new MCP server built today should use Streamable HTTP. Existing servers that implement HTTP+SSE continue to function for clients that send the legacy protocol version, but teams should plan migration. The enterprise patterns in section 06 all assume Streamable HTTP — pattern 4's edge-caching model in particular depends on the unified endpoint architecture that Streamable HTTP provides.

MCP transport evolution · 2024-11-05 → 2025-11-25
Source: MCP specification changelog, modelcontextprotocol.io

  • 2024-11-05 · HTTP+SSE: original remote transport, two separate endpoints (now deprecated)
  • 2025-03-26 · Streamable HTTP introduced: unified bidirectional endpoint, replaced HTTP+SSE
  • 2025-11-25 · HTTP+SSE formally deprecated: backward-compat only via MCP-Protocol-Version header
  • 2025-11-25 · Streamable HTTP, current standard: MCP-Session-Id, Origin validation, full auth mandate

This timeline matters for enterprise procurement teams evaluating MCP vendors and third-party server implementations. A server that still advertises HTTP+SSE as its primary transport is running on a deprecated foundation — either the vendor hasn't updated to the current spec, or it's targeting backward compatibility with a specific older client. Neither is a disqualifying condition, but both warrant a question about the vendor's spec-tracking cadence. For the full picture of how MCP adoption is evolving at the enterprise level, see our Q3 2026 MCP adoption forecast.

09 · Operational Security
Rate limiting, scope minimization, and progressive least-privilege.

The spec's security best practices section treats scope minimization as a first-class concern, not an afterthought. The recommended model is progressive least-privilege: issue a minimal initial scope set (for example, mcp:tools-basic) on first authorization, then use incremental WWW-Authenticate scope challenges to request elevation only when a specific operation requires it. Servers SHOULD avoid publishing the full list of available scopes in their scopes_supported metadata, which limits the information available to an attacker enumerating the server's attack surface.
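A scope challenge of this kind might be produced as follows. The tool names and the `mcp:records-write` scope are hypothetical; the `insufficient_scope` error code and the 403 status come from RFC 6750, and the sketch assumes the caller's granted scopes were already extracted from a validated token.

```python
# Hypothetical mapping from tool name to the scopes it requires.
TOOL_SCOPES = {
    "search_docs": {"mcp:tools-basic"},
    "delete_record": {"mcp:tools-basic", "mcp:records-write"},
}


def authorize_call(tool: str, granted: set) -> tuple:
    """Return (status, headers); a 403 carries an incremental scope challenge."""
    required = TOOL_SCOPES[tool]
    if required <= granted:
        return 200, {}
    scope_value = " ".join(sorted(required))
    challenge = f'Bearer error="insufficient_scope", scope="{scope_value}"'
    return 403, {"WWW-Authenticate": challenge}
```

A client holding only `mcp:tools-basic` can call `search_docs` immediately and is asked to step up only when it reaches `delete_record`, which keeps the initial grant minimal.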

Rate limiting on tool invocations is a separate but related requirement. The spec explicitly places server-side rate limiting in the security category: servers MUST implement access controls and rate-limit tool invocations. For the federated gateway pattern, the gateway is the natural enforcement point for cross-server rate limits — individual downstream servers can enforce per-server limits while the gateway enforces aggregate limits per client identity. For pattern 2 (multi-tenant), per-tenant rate limits at the storage access layer prevent one tenant's agent workload from consuming capacity that degrades other tenants.
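As one possible enforcement mechanism (the spec does not prescribe an algorithm), a token bucket keyed by client identity or tenant ID covers both the gateway and the per-tenant cases. The class name and parameters below are assumptions for illustration.

```python
import time


class TokenBucket:
    """Per-key token bucket: `burst` calls at once, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: float, clock=time.monotonic):
        self._rate = rate_per_sec
        self._burst = burst
        self._clock = clock   # injectable for testing
        self._state = {}      # key -> (tokens remaining, last update time)

    def allow(self, key: str) -> bool:
        """Consume one token for `key` if available; otherwise reject the call."""
        now = self._clock()
        tokens, last = self._state.get(key, (self._burst, now))
        tokens = min(self._burst, tokens + (now - last) * self._rate)
        if tokens < 1:
            self._state[key] = (tokens, now)
            return False
        self._state[key] = (tokens - 1, now)
        return True
```

A gateway could hold one bucket keyed by client identity for aggregate limits while each downstream server keeps its own bucket keyed by tenant, so one tenant exhausting its budget leaves the others untouched.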

Tool input validation is the third control in this tier. Servers MUST validate all tool inputs against their declared JSON Schema before executing the tool logic. This is not just a correctness concern — it is a security boundary. An agent that has been injected with a malicious prompt can craft tool arguments that, if not validated, trigger unexpected behavior in the downstream API the MCP server wraps. Clients, for their part, SHOULD prompt for user confirmation on sensitive tool operations and SHOULD display tool inputs to the user before sending them to the server.
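To show where the boundary sits, here is a deliberately tiny validator that handles only a subset of JSON Schema (`type`, `required`, `properties`, and `additionalProperties: false`). It is a sketch under that stated limitation; a production server would use a complete JSON Schema validator. The schema and tool arguments in the usage note are hypothetical.

```python
_TYPES = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "object": dict, "array": list,
}


def validate(value, schema: dict, path: str = "$") -> list:
    """Return a list of error strings; an empty list means the input passed."""
    expected = schema.get("type")
    if expected:
        ok = isinstance(value, _TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("integer", "number") and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: required")
        properties = schema.get("properties", {})
        for name, item in value.items():
            if name in properties:
                errors += validate(item, properties[name], f"{path}.{name}")
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}.{name}: unexpected property")
    return errors
```

The tool handler runs only when `validate(arguments, input_schema)` returns an empty list; anything else is rejected before the wrapped downstream API is touched.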

For teams building out their first enterprise MCP deployment, our MCP server anti-patterns guide catalogs the specific design mistakes we encounter most often — from overly broad initial scopes to missing input validation on tool arguments. Our AI transformation service includes architecture review for MCP deployments at this layer, specifically the auth, scope, and rate-limiting posture.

MCP infrastructure in 2026

MCP has graduated from a single-vendor protocol to enterprise infrastructure.

The 2025-11-25 spec is mature. The authorization framework — OAuth 2.1 + PKCE S256 + RFC 9728 + RFC 8707 + the token passthrough prohibition — is not aspirational guidance; it is a defined, mandatory security posture for remote MCP servers. The four enterprise topologies in this post cover the deployment surface that most organizations will encounter: single-tenant for isolated internal tooling, multi-tenant row-isolated for SaaS-style multi-organization deployments, federated gateway for large estates that need central audit, and edge-cached read-only for high-RPS tool discovery. Each pattern has a spec-correct auth assignment and a known trade-off — the selection is an architectural decision, not a preference.

What comes next is worth watching. MCP tunnels, the Anthropic-led capability expected to be previewed at Code with Claude London (May 19-20, 2026), may extend the Streamable HTTP transport to support edge federation natively, which would significantly change the economics of patterns 3 and 4. Edge-MCP federation, where a CDN node handles not just tools/list caching but also partial JWT validation and scope enforcement, is the logical endpoint of the current transport trajectory. And MCP Registry-as-a-service, a managed version of the official registry.modelcontextprotocol.io, could abstract the RFC 9728 discovery layer for organizations that prefer not to operate their own well-known endpoints. The security spec is settled; the operational tooling around it is still evolving.

For now, the practical move is to implement the four controls from section 07 before deploying any MCP server that proxies multiple clients, audit existing deployments against the token passthrough prohibition, and verify that RFC 8707 Resource Indicators are present in every token request rather than treating audience binding as optional. The spec is public and precise — the gap between what it requires and what most deployments actually implement is where enterprise risk concentrates.

Build spec-correct MCP infrastructure

Enterprise MCP is not a tutorial problem — it's an architecture problem.

Our team helps enterprises design, audit, and deploy MCP server infrastructure — from single-tenant internal tooling to federated gateways across large agent estates. Architecture review, auth hardening, and production rollout.

What we work on

MCP enterprise engagements

  • OAuth 2.1 + PKCE + RFC 9728/8707 implementation review
  • Token passthrough audit — existing MCP server estate
  • Multi-tenant row-isolated MCP for SaaS products
  • Federated gateway design for 10+ server estates
  • Rate limiting, scope minimization, and tool input validation
FAQ · MCP enterprise patterns

MCP enterprise questions answered.

What is the Model Context Protocol (MCP)?

The Model Context Protocol (MCP) is an open standard that gives AI agents a uniform interface to connect to external tools, data sources, and services. It defines a JSON-RPC 2.0 wire format with stateful connections and capability negotiation. Servers expose Tools, Prompts, and Resources; clients expose Sampling, Roots, and Elicitation. The spec is maintained at modelcontextprotocol.io and was originally introduced by Anthropic in late 2024. As of May 2026, the ecosystem has reportedly grown to over 10,000 tracked MCP servers across public registries including PulseMCP, Smithery, and the official registry.modelcontextprotocol.io.