AI agent deployment model selection — SaaS, self-hosted, or hybrid — is the single decision separating enterprises that have AI agents in production from the 62–68% that remain stuck in pilots. According to Gartner, 17% of organizations have deployed AI agents today; 60%+ expect to within two years. The bottleneck is rarely the model or the orchestration framework. It is the deployment architecture decision that teams keep deferring because no structured framework existed to make it.
The stakes have shifted in 2026. Every major SaaS vendor — Agentforce 360, Microsoft Copilot Studio, SAP Joule — now ships a managed-control-plane / private-data-plane hybrid mode. Every self-hosted leader — Anthropic's public-beta sandboxes, LangGraph Platform (now rebranded as LangSmith Deployment) — now ships managed-runtime escape hatches. The binary is obsolete. The question is not SaaS vs. self-hosted; it is which configuration of the hybrid spectrum fits your compliance posture, team size, and cost model.
This guide delivers three proprietary artifacts: a 12-dimension decision matrix that extends the typical 6-dimension scorecard with observability, eval maturity, governance, scale ceiling, model-choice freedom, and data residency; a 36-cell scoring table (1–5 per cell with justification); and a 16-node terminal recommendation tree that routes by team-size × use-case to a specific platform recommendation. If you have already decided to buy rather than build from scratch, see our enterprise AI agent build vs. buy framework as the prior-stage decision.
- 01. The production gap is a deployment decision, not a capability gap. 79% of enterprises report AI agent adoption; Gartner puts actual production deployment at 17%. The gap is not attributable to model quality: GPT-4o, Claude 3.7, and Gemini 2.5 Pro are all capable enough for most enterprise agent tasks. The gap is the deployment-model decision: organizations that cannot answer 'SaaS, self-hosted, or hybrid, and why?' are the ones stuck in pilots. This matrix is designed to close that decision gap.
- 02. Hybrid wins the weighted enterprise scorecard, but rarely by the margin people expect. Across 12 enterprise-weighted dimensions, Hybrid scores 4.2/5, Self-Hosted 3.8/5, and SaaS 3.0/5. Hybrid's lead is real but concentrated: it wins decisively on compliance flexibility, data residency, and governance stacking. On time-to-value, SaaS still leads by a wide margin (days vs. months). The right answer for a 5-person team shipping a customer support bot is still SaaS; the matrix tells you when to graduate.
- 03. Every SaaS leader now ships a hybrid mode, so the binary is obsolete. Agentforce 360 for AWS (announced AWS re:Invent, December 2025) runs entirely on AWS infrastructure with Salesforce's Trust Boundary: it is SaaS orchestration on a private-cloud data plane. Microsoft Copilot Studio runs computer-use agents on Microsoft-managed infrastructure with no BYOC option, but its MCP server ecosystem is open. Anthropic's self-hosted sandboxes (public beta, May 19, 2026) move tool execution to the customer's environment while keeping orchestration on Anthropic infrastructure. The labels 'SaaS' and 'self-hosted' now describe control-plane ownership, not the full architecture.
- 04. Compliance thresholds are the most reliable decision triggers. Three compliance events trigger an automatic upgrade from SaaS to Hybrid: crossing any single one of HIPAA-eligible data, FedRAMP, or EU AI Act Annex III high-risk classification. EU AI Act high-risk obligations under Articles 6–15 become enforceable August 2, 2026, subject to the AI Act Omnibus amendments agreed in trilogue May 7, 2026. Penalties reach 7% of global annual turnover. The compliance dimension is where the SaaS-vs.-Hybrid decision most often has a definitive answer regardless of team size.
- 05. The team-size × use-case matrix provides 16 terminal node recommendations. Rather than stopping at a scorecard, this guide routes each combination of team size (1–10, 10–50, 50–200, 200+) and use case (customer support, coding agents, internal ops, customer-facing) to a specific platform recommendation with a one-line rationale. The 16-node tree is the actionable output of the 36-cell matrix. Most published deployment frameworks omit this routing step; this one does not.
01 — The Problem
79% adopted, 17% in production: the deployment decision is the bottleneck.
Gartner reported in August 2025 that 17% of organizations have deployed AI agents, with 60%+ expecting to do so within two years. Industry surveys from early 2026 place self-reported adoption at 79%. The 62-percentage-point gap between claimed adoption and confirmed production deployment is not primarily explained by model capability shortfalls — the frontier models available in 2026 are capable of handling the vast majority of enterprise agent tasks.
The gap is explained by three factors that all trace back to the deployment architecture decision: (1) data residency requirements that teams cannot meet with a default SaaS configuration; (2) compliance evidence requirements (HIPAA BAA coverage, SOC 2 inheritance, EU AI Act Article 6 classification) that the pilot-to-production handoff exposes; (3) operational ownership gaps — who is responsible for the agent when it fails in production, and does that team have the infrastructure access to diagnose the failure.
Forrester VP Emerging Tech Brian Hopkins predicted in the 2026 Predictions report that “in 2026, AI will inevitably lose its sheen, trading its tiara for a hard hat… companies will distribute their bets across agentic ecosystems.” That bet-distribution is exactly what the SaaS / self-hosted / hybrid spectrum enables — but only if teams can make a principled architectural decision before the pilot ends. Gartner additionally forecasts that 40% of enterprise applications will include task-specific agents by 2026, up from less than 5% in 2025. The organizations that close the production gap in 2026 will ship the majority of those agents. Those that do not will watch vendors ship into their workflows instead.
- 79%: AI agents adopted (self-report). Cross-industry surveys from early 2026. Includes pilot, PoC, and internal-tooling deployments alongside production workloads.
- 17%: Organizations with agents in production. Gartner, August 2025. Defined as agent workflows handling live business transactions, not sandboxed demos or internal prototypes.
- 40%: Enterprise apps with task-specific agents. Gartner forecasts 40% of enterprise applications will include task-specific agents by 2026, up from under 5% in 2025.
- August 2, 2026: EU AI Act high-risk enforcement date. Articles 6–15 obligations become enforceable August 2, 2026 (subject to AI Act Omnibus, agreed May 7, 2026). Penalties up to 7% of global annual turnover.
02 — Model Definitions
SaaS, Self-Hosted, Hybrid: precise definitions for 2026.
The labels have become imprecise in 2026 because every leader on both sides of the original binary now ships the other side as an option. The definitions below use control-plane ownership as the primary axis, which is the dimension that matters most for compliance evidence, observability, and operational ownership.
- SaaS: vendor-managed control plane and data plane. The vendor operates both the orchestration layer (agent scheduling, tool dispatch, memory, logging) and the data plane (tool execution environment, storage). The customer configures agents via a vendor-provided builder or SDK but does not own or operate the infrastructure. Examples: Agentforce 360 standard tier, Microsoft Copilot Studio (managed-only for computer-use tools), SAP Joule Studio. Time-to-value: days to two weeks.
- Self-Hosted: customer-managed control plane and data plane. The customer deploys and operates both layers in their own infrastructure (cloud VPC, on-premises, or sovereign cloud). The customer selects the model, manages the Kubernetes cluster, writes orchestration code, and owns observability end-to-end. Examples: LangSmith Deployment fully self-hosted tier, Microsoft Agent Framework on AKS, CrewAI open-source on customer k8s. Infra minimum at 30K monthly executions: reportedly $425–740K/yr for a dedicated DevOps team.
- Hybrid: vendor control plane, customer data plane. The vendor manages the orchestration layer (scheduling, policies, billing) while the customer operates the data plane (tool execution, storage, model inference) in their own VPC or private cloud. Compliance evidence is split: vendor provides control-plane SOC 2 / BAA coverage; customer provides data-plane evidence. Examples: Agentforce 360 for AWS, LangSmith Deployment hybrid tier, LangGraph + Anthropic self-hosted sandboxes (public beta). Time-to-value: 2–6 weeks for control plane, 4–8 weeks for data plane.
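The three definitions reduce to a lookup on who owns each plane. The sketch below is one way to encode that; the `Owner`, `DeploymentModel`, and `classify` names are illustrative and not taken from any vendor SDK. The fourth combination (customer control plane, vendor data plane) is not named by the definitions, so the function returns `None` for it instead of guessing.

```python
from enum import Enum
from typing import Optional


class Owner(Enum):
    VENDOR = "vendor"
    CUSTOMER = "customer"


class DeploymentModel(Enum):
    SAAS = "SaaS"
    SELF_HOSTED = "Self-Hosted"
    HYBRID = "Hybrid"


# (control-plane owner, data-plane owner) -> deployment model
_OWNERSHIP_TABLE = {
    (Owner.VENDOR, Owner.VENDOR): DeploymentModel.SAAS,
    (Owner.CUSTOMER, Owner.CUSTOMER): DeploymentModel.SELF_HOSTED,
    (Owner.VENDOR, Owner.CUSTOMER): DeploymentModel.HYBRID,
}


def classify(control_plane: Owner, data_plane: Owner) -> Optional[DeploymentModel]:
    """Map plane ownership to a deployment model.

    Returns None for customer control plane + vendor data plane,
    a combination the definitions do not cover.
    """
    return _OWNERSHIP_TABLE.get((control_plane, data_plane))


print(classify(Owner.VENDOR, Owner.CUSTOMER))
```

Keying on ownership instead of vendor name keeps the classification stable when a vendor ships a new tier: the same product can land in different rows depending on how it is deployed.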
03 — Decision Framework
Why 12 dimensions: the six gaps in most published matrices.
Most published deployment decision matrices stop at six or eight dimensions: cost, time-to-value, customization, compliance, lock-in, and integration breadth. That set is necessary but insufficient for a 2026 enterprise decision. The six dimensions that published matrices typically omit are where the critical failure modes cluster in production: observability (vendor dashboard vs. OpenTelemetry ownership), eval workflow maturity (vendor eval suite vs. pluggable Braintrust/Inspect), governance (policy-as-code stack vs. DIY guardrails), scale ceiling (requests per second and concurrent agent limits), model-choice freedom (BYOK, BYO weights, or vendor-curated shortlist), and data residency (true sovereignty vs. region selection with CLOUD Act exposure).
The full 12-dimension list: (1) Cost / TCO, (2) Time-to-Value, (3) Customization depth, (4) Compliance flexibility, (5) Vendor lock-in risk, (6) Integration breadth, (7) Observability native vs. OTel, (8) Eval / test workflow maturity, (9) Governance (policy-as-code), (10) Scale ceiling, (11) Model-choice freedom, (12) Data residency / sovereignty. Each is scored 1–5 per model in §04. For deeper governance methodology, see our AI agent governance and compliance guide.
One clarification on lock-in: even fully self-hosted architectures carry lock-in risk. LangGraph syntax, the AutoGen-to-Microsoft-Agent-Framework migration cost (following the March 2026 split), and the Kubernetes-specific operational assumptions are all forms of lock-in. The dimension is scored 5 only when orchestration code is portable across runtimes with minimal rewrite, never on the assumption that self-hosted means zero lock-in. It does not.
04 — Scoring Matrix
The 36-cell matrix: 12 dimensions × 3 models scored 1–5.
Each cell below contains a score (1 = worst, 5 = best for an enterprise buyer) and a short justification. The weighted average uses equal weighting across the 12 dimensions; an organization can re-weight by the dimensions most material to its context. The Hybrid model's 4.2 average reflects its top scores on the compliance, residency, governance, and scale dimensions, not a lead across all 12. For adjacent stack decisions, see the AI agent stack decision tree.
Dimension 1 — Cost and total cost of ownership
SaaS: 3/5 — Predictable license + consumption pricing, but vendor margin is baked in. Agentforce 360: $125–$550/user/mo + $2/conversation. Copilot Studio: $0.01/credit, $200/mo for 25K credits. Self-Hosted: 2/5 — Cheapest at hyperscale; add a full DevOps team at medium scale (reportedly $425–740K/yr for 30K monthly executions). Hybrid: 4/5 — Control-plane subscription plus customer-VPC compute; mid-range TCO with compliance value offsetting cost.
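A quick worked comparison makes the Dimension 1 figures easier to weigh. The sketch below uses only the numbers quoted in this cell and assumes, purely for illustration, that one agent execution is billed as one $2 conversation. It ignores per-user licenses on the SaaS side and compute and model costs on the self-hosted side, so treat it as an order-of-magnitude check and not a TCO model.

```python
MONTHLY_EXECUTIONS = 30_000                  # volume used in the self-hosted estimate
SAAS_PRICE_PER_CONVERSATION = 2.00           # USD, Agentforce 360 figure quoted above
SELF_HOSTED_TEAM_COST = (425_000, 740_000)   # USD per year, reported range

annual_executions = MONTHLY_EXECUTIONS * 12

# Assumption: one agent execution == one billed conversation.
saas_annual = annual_executions * SAAS_PRICE_PER_CONVERSATION

# Spread the reported DevOps team cost over the same annual volume.
self_hosted_per_execution = tuple(
    round(cost / annual_executions, 2) for cost in SELF_HOSTED_TEAM_COST
)

print(f"SaaS consumption per year: ${saas_annual:,.0f}")
print(
    "Self-hosted team cost per execution: "
    f"${self_hosted_per_execution[0]} to ${self_hosted_per_execution[1]}"
)
```

Under that assumption, SaaS consumption at 30K monthly executions lands inside the reported self-hosted staffing range, which is consistent with self-hosting only pulling ahead on cost at larger volumes.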
Dimension 2 — Time from decision to production
SaaS: 5/5 — Days to two weeks with prebuilt connectors and no infrastructure provisioning. Self-Hosted: 2/5 — Three to six months for a basic production deployment; longer for enterprise-grade observability and security hardening. Hybrid: 4/5 — Two to six weeks for control-plane setup, plus four to eight weeks for data-plane provisioning. SaaS wins this dimension decisively; self-hosted loses it decisively.
Dimension 3 — Agent and tool customization
SaaS: 2/5 — Vendor-shaped builder with limited primitives; custom tools possible but bounded by the vendor's SDK surface area. Self-Hosted: 5/5 — Anything Python or TypeScript can express; BYO orchestration logic, BYO tool schemas, BYO eval pipelines. Hybrid: 4/5 — Vendor SDK with custom tool injection via MCP servers or data-plane APIs; more flexible than SaaS, less than full self-host.
Dimension 4 — Compliance posture ownership
SaaS: 3/5 — Inherit vendor BAA and SOC 2; one-size compliance posture across all customers. Works for common profiles (HIPAA with BAA, SOC 2 Type II), fails for custom evidence requirements. Self-Hosted: 5/5 — Customer designs the compliance posture from scratch; highest flexibility at highest cost. Hybrid: 5/5 — Vendor BAA on control plane; customer owns data-plane evidence. Best of both: vendor compliance baseline plus custom data-plane controls.
Dimension 5 — Portability and switching costs
SaaS: 1/5 — Proprietary builder, proprietary pricing units, data-egress costs, and re-implementation cost when switching. Highest lock-in of the three. Self-Hosted: 5/5 — Model-agnostic; container-portable if orchestration code is framework-neutral. Note: LangGraph syntax and AutoGen/MAF migration costs are real lock-in — score reflects portability ceiling, not average case. Hybrid: 3/5 — Control plane locked to vendor; data plane portable if designed for it.
Dimension 6 — Out-of-the-box connectors and actions
SaaS: 5/5 — Agentforce 360: 100+ prebuilt actions. Copilot Studio: 1,000+ MCP servers, full M365 graph. SAP Joule: 2,400+ skills across ERP modules. Self-Hosted: 2/5 — MCP protocol gives you everything you can build; zero prebuilt connectors. Hybrid: 4/5 — Vendor connector library plus custom MCP servers on the data plane. SaaS wins integration breadth for standard enterprise applications.
Dimension 7 — Observability out-of-the-box
SaaS: 4/5 — Command Center (Agentforce), Foundry telemetry (Microsoft), both built-in with dashboards, trace views, and alerting. Self-Hosted: 3/5 — OpenTelemetry plus LangSmith, Langfuse, or Braintrust opt-in; more powerful ceiling but requires configuration. Hybrid: 4/5 — Vendor control-plane traces plus customer-side OTel for the data plane; dual visibility but dual maintenance.
Dimension 8 — Evaluation and testing workflow
SaaS: 3/5 — Vendor eval suite with opinionated metrics; limited ability to plug in external eval frameworks. Self-Hosted: 4/5 — Braintrust, Inspect, Promptfoo, and LangSmith all plug in cleanly; custom metrics straightforward. Hybrid: 4/5 — Vendor eval on control plane plus own eval pipelines on data plane. Self-hosted and hybrid share the ceiling; SaaS eval is functional but constrained.
Dimension 9 — Policy enforcement and guardrails
SaaS: 4/5 — Agentforce Trust Layer and Microsoft Foundry policies are turnkey; enterprise-grade guardrails without custom implementation. Self-Hosted: 3/5 — Customer writes their own guardrails (NeMo Guardrails, constitutional AI prompting, custom classifiers); powerful but operational burden is entirely on the customer team. Hybrid: 5/5 — Vendor policy on control plane stacked with customer policy on data plane; defense-in-depth governance that neither pure model can match.
Dimension 10 — Requests per second and concurrent agents
SaaS: 4/5 — Vendor autoscale; throttles apply per tier (Agentforce concurrency limits, Copilot Studio credit exhaustion). Self-Hosted: 5/5 — Bounded only by the customer's cluster; no vendor-imposed concurrency caps. Hybrid: 5/5 — Control plane scales separately from the data plane; can independently scale each layer. Self-hosted and hybrid both win; SaaS is sufficient for most teams but has throttle ceilings.
Dimension 11 — Model provider and weight flexibility
SaaS: 2/5 — Vendor-curated model list (Agentforce: Atlas, gpt-4o, Amazon Bedrock library; Copilot Studio: Azure OpenAI models). Cannot BYO weights. Self-Hosted: 5/5 — Any model, any provider, BYO weights, BYO API key; full model-choice sovereignty. Hybrid: 3/5 — Vendor's certified model shortlist on the control plane; BYOK possible on the data plane for some vendors. Self-hosted wins model choice decisively.
Dimension 12 — Data sovereignty and residency
SaaS: 3/5 — Vendor region picker (Hyperforce / Azure / AWS region selection); CLOUD Act exposure for US-parent vendors processing EU data. Self-Hosted: 5/5 — True sovereignty; customer runs the container in a sovereign cloud, on-premises, or private VPC. AWS European Sovereign Cloud (launched 2026) is a viable option. Hybrid: 5/5 — Control plane in vendor region; data plane in customer VPC or sovereign cloud. No raw customer data leaves the customer perimeter.
Aggregating across all 12 dimensions: SaaS weighted enterprise score 3.0/5, Self-Hosted 3.8/5, Hybrid 4.2/5. Hybrid's lead reflects its 5/5 scores on compliance flexibility, governance stacking, scale ceiling, and data residency. It ties Self-Hosted on three of those and stands alone at the top on governance, where stacking vendor and customer policy beats either pure model. SaaS leads outright only on time-to-value and integration breadth, but those two dimensions are decisive for teams under 50 engineers without a dedicated infrastructure function.
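The aggregation and the re-weighting step can be reproduced directly from the 36 cell scores. The sketch below transcribes the scores from Dimensions 1 through 12; the dictionary keys and the `weighted_scores` helper are illustrative names. With equal weights it reproduces the Self-Hosted (3.8) and Hybrid (4.2) figures. The SaaS cells average to 3.25 under equal weighting, a little above the 3.0 headline figure, so that number may reflect a different rounding or weighting than the one stated.

```python
MODELS = ("SaaS", "Self-Hosted", "Hybrid")

# dimension -> (SaaS, Self-Hosted, Hybrid), transcribed from the matrix above
SCORES = {
    "cost_tco": (3, 2, 4),
    "time_to_value": (5, 2, 4),
    "customization": (2, 5, 4),
    "compliance": (3, 5, 5),
    "lock_in": (1, 5, 3),
    "integration": (5, 2, 4),
    "observability": (4, 3, 4),
    "eval_maturity": (3, 4, 4),
    "governance": (4, 3, 5),
    "scale_ceiling": (4, 5, 5),
    "model_choice": (2, 5, 3),
    "data_residency": (3, 5, 5),
}


def weighted_scores(weights=None):
    """Return the weighted mean score per model.

    weights maps dimension -> weight; dimensions missing from a
    supplied dict count as weight 0. None means equal weighting.
    """
    if weights is None:
        weights = {dim: 1.0 for dim in SCORES}
    total = sum(weights.get(dim, 0.0) for dim in SCORES)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {
        model: sum(weights.get(dim, 0.0) * cells[i] for dim, cells in SCORES.items()) / total
        for i, model in enumerate(MODELS)
    }


equal = weighted_scores()
print({model: round(score, 2) for model, score in equal.items()})

# Hypothetical re-weighting for a small team that values speed and connectors.
small_team = {**{dim: 1.0 for dim in SCORES}, "time_to_value": 3.0, "integration": 3.0}
print({model: round(score, 2) for model, score in weighted_scores(small_team).items()})
```

Tripling the weight on time-to-value and integration breadth moves SaaS ahead of Self-Hosted, which matches the guidance that those two dimensions dominate for small teams.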
Figure: Deployment model weighted enterprise scores. Source: Digital Applied 12-dimension deployment decision matrix, May 2026.
05 — SaaS Vendor Profiles
Agentforce 360, Copilot Studio, Joule: SaaS leaders profiled.
The three SaaS leaders in enterprise AI agent deployment occupy distinct niches: Agentforce 360 is built for Salesforce-centric organizations and customer-facing workflows; Microsoft Copilot Studio covers Microsoft 365 workflows and now computer-use agents (general availability May 13, 2026); SAP Joule is the specialist for SAP ERP estates. Choosing between them is rarely a capability question — it is a question of which system of record your agents need to interact with.
Salesforce Agentforce 360. GA'd at Dreamforce '25 on October 14, 2025, Agentforce 360 ships Agentforce Builder, Agent Script, Agentforce Voice, and Intelligent Context as a bundled platform. ARR reportedly reached $800M (up 169% YoY) per Salesforce Q3 FY2026 results, with 29,000 deals closed since launch and reportedly 2.4 billion agentic work units delivered. Pricing: Flex Credits at $500 per 100,000 credits (approximately $2 per conversation); per-user Agentforce add-on at $125–$150/month; Agentforce 1 Enterprise Edition at $550/user/month; FedRAMP-High public-sector tier at $650/user/month. The Trust Layer is the governance component: it enforces data masking, toxicity filtering, and policy-as-code rules before any LLM call leaves the Salesforce perimeter.
Microsoft Copilot Studio. Computer-using agents in Microsoft Copilot Studio reached general availability on May 13, 2026, expanded to all commercial geographies in Power Platform. These agents “treat websites and desktop applications as tools,” per EVP Charles Lamanna, extending RPA with capabilities that overcome “the fragility of UI elements and complex dynamic interfaces.” Pricing switched to Copilot Credits on September 1, 2025: $0.01 per credit on pay-as-you-go, or a prepaid 25,000-credit pack at $200/month (no rollover), which works out to $0.008 per credit. Computer-use runs on Microsoft-managed infrastructure only, with no bring-your-own-compute option for that specific capability. For a detailed technical breakdown, see our Copilot Studio computer-use GA deep dive.
SAP Joule Studio. SAP Joule has 40+ specialized agents and reportedly 2,400 skills across finance, HR, procurement, and supply chain. Joule Studio (the managed development environment) reached Early Adopter Care in May 2026 with GA targeted for Q3 2026, so it is not generally available today. Joule A2A (Agent-to-Agent protocol) GA is planned for Q4 2026. Only approximately 3% of SAP customers reportedly use SAP Business AI in production today, while 77% of AI-active enterprises rely on non-SAP tools like Microsoft Copilot, a significant uptake gap that Joule Studio is designed to close.
“This is what AI was meant to be… we are creating digital workers, digital labor — a multi-trillion-dollar TAM.” Salesforce reportedly plans to spend approximately $300M on Anthropic tokens in 2026, “almost entirely to power AI-assisted coding.” The Agentforce 360 for AWS joint statement adds: “customer data never leaves the organization's secure environment, no external provider can use customer data for training.”
06 — Self-Hosted Vendor Profiles
Anthropic sandboxes and LangSmith Deployment: self-hosted leaders profiled.
The self-hosted leader landscape in 2026 is defined by two announcements that blurred the boundary between managed and self-hosted: Anthropic's public-beta self-hosted sandboxes (May 19, 2026) and LangGraph Platform's October 2025 rebrand as LangSmith Deployment — the production-deployment surface of the LangChain ecosystem. Both offer self-hosted modes while maintaining managed-runtime escape hatches.
Anthropic self-hosted sandboxes (public beta). Announced at Code with Claude London on May 19, 2026 — the first Anthropic developer event outside the US — the self-hosted sandbox model moves tool execution to the customer's environment while keeping agent orchestration on Anthropic infrastructure. Managed sandbox providers at launch: Cloudflare, Daytona, Modal, and Vercel. Critical caveats: this is a public beta, not GA. Memory is not supported in self-hosted sessions. The sandbox is not available on Claude Platform on AWS at launch. Anthropic stated: “tool execution moves to your own configured environment… inside your perimeter, network policies, audit logging, and security tooling are already in place, and files and repositories don't leave.” For seven production patterns using Anthropic's self-hosted architecture, see our Anthropic self-hosted sandbox production patterns guide.
LangGraph Platform (now LangSmith Deployment). LangGraph Platform reached GA on May 14, 2025 and was rebranded as LangSmith Deployment in October 2025. Only the deployment product was renamed; the underlying graph-based orchestration framework is still called LangGraph. Four deployment tiers: Cloud (managed), Hybrid (SaaS control plane + customer VPC data plane), Fully Self-Hosted, and Developer (free up to 100K nodes/month). Nearly 400 companies deployed agents on the platform during its beta period (June 2024 – May 2025). March 2026 introduced langgraph deploy to supersede langgraph up for cloud production deployments. The platform integrates natively with LangSmith for tracing, evaluation, and dataset management.
Microsoft Agent Framework (MAF). Following the March 2026 AutoGen split, the production successor merging AutoGen and Semantic Kernel is Microsoft Agent Framework, positioned for “millions of steps” with persistent checkpointing. AutoGen v0.7.x enters stable maintenance; AutoGen Studio continues as an experimental tool. For teams migrating from AutoGen, the official migration guide at learn.microsoft.com/agent-framework/migration-guide/from-autogen is the canonical reference. MAF is the self-hosted / Hybrid recommendation for large Microsoft ecosystem organizations.
The 2026 collapse is real: Agentforce 360 for AWS is SaaS orchestration on a private-cloud data plane. Anthropic's self-hosted sandboxes are self-hosted tool execution on a managed orchestration plane. The binary is obsolete; what remains is the control-plane ownership question. (Digital Applied analysis, May 2026)
07 — Hybrid Vendor Profiles
Agentforce for AWS, MAF + Foundry, CrewAI: hybrid leaders profiled.
Hybrid deployment in 2026 is not an edge case — it is the default for compliance-bound enterprises. Three platforms have the clearest hybrid architecture story: Agentforce 360 for AWS, Microsoft Agent Framework paired with Azure Foundry, and LangSmith Deployment's hybrid tier. CrewAI also offers a hybrid path via its Enterprise AMP tier.
Agentforce 360 for AWS. Announced at AWS re:Invent in December 2025 and available in AWS Marketplace as of early 2026, Agentforce 360 for AWS runs entirely on AWS infrastructure inside the Salesforce Trust Boundary. Amazon Bedrock provides model access; all LLM traffic reportedly stays within Salesforce's private AWS cloud. The joint statement confirms: “customer data never leaves the organization's secure environment, no external provider can use customer data for training.” This is the canonical hybrid architecture: Salesforce manages the control plane (Trust Layer, Command Center, billing), and the customer's AWS account hosts the data plane. Pricing follows the standard Agentforce 360 model plus AWS infrastructure costs.
Microsoft Agent Framework + Azure Foundry. Foundry serves as the managed control plane (providing policy enforcement, telemetry, and the evaluation harness) while MAF on AKS (Azure Kubernetes Service) or on-premises serves as the data plane. Forrester's Total Economic Impact study of Microsoft Foundry (February 2026) found 327% ROI over three years, with $15.7M attributed to developer productivity and payback reportedly as fast as six months. 67% of surveyed organizations cited AI security/privacy/governance as a top adoption driver, a direct validation that the control-plane governance value proposition is worth the hybrid complexity.
CrewAI Enterprise (hybrid path). CrewAI Enterprise is custom-priced (contact sales) and includes 10,000 executions per month, up to 50 deployed crews, SOC 2 Type II, SSO, and PII detection. The open-source self-host tier offers free unlimited usage. The managed AMP cloud tier starts at approximately $99/month. CrewAI occupies the mid-market hybrid position — smaller than Salesforce or Microsoft in enterprise reach, but with faster open-source customization and a lower contact-sales threshold.
ServiceNow's approach is also instructive: the company signed multi-year deals with both OpenAI and Anthropic in January 2026, explicitly framing it as customer optionality and anti-lock-in. Snowflake committed $200M to OpenAI while stating it is “intentionally model-agnostic.” Both companies are building hybrid postures at the model-access layer, not just the deployment layer — treating vendor optionality as a first-class architectural requirement.
08 — Compliance
HIPAA, SOC 2, EU AI Act: compliance thresholds that force model upgrades.
Compliance is the dimension that most reliably forces a deployment model upgrade. Three thresholds apply to 2026 enterprise agent deployments: HIPAA-eligibility (healthcare and adjacent sectors), SOC 2 Type II inheritance (most enterprise security requirements), and EU AI Act Annex III high-risk classification (all sectors processing EU-resident data in high-risk categories).
SOC 2 Type II. Anthropic, OpenAI, Salesforce, and Microsoft all hold SOC 2 Type II certifications as of 2026. For most enterprise security questionnaire requirements, SOC 2 Type II inheritance from a SaaS vendor is sufficient. This is not a differentiator between the major SaaS vendors — it is table stakes.
HIPAA-eligibility. HIPAA-eligibility requires a signed Business Associate Agreement (BAA) with the vendor for the specific service handling protected health information. Both Anthropic and Microsoft Azure OpenAI sign BAAs for HIPAA-eligible workloads on designated services. AWS leads HIPAA-eligible service count (100+), Azure covers 80+, and GCP covers approximately 40. Critical clarification: “SaaS is HIPAA-compliant” is inaccurate. The accurate formulation is “HIPAA-eligible with vendor BAA for the listed services.” Not all services from any vendor are HIPAA-eligible by default; verify the BAA scope for each specific service handling PHI.
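The per-service BAA check can be run as a simple set difference before a pilot moves to production. The sketch below is illustrative only: the service names are hypothetical placeholders, and the covered list would in practice be transcribed from the signed BAA itself, not from a vendor marketing page.

```python
def services_outside_baa(phi_services, baa_covered_services):
    """Return the PHI-handling services that the signed BAA does not list."""
    return sorted(set(phi_services) - set(baa_covered_services))


# Hypothetical service names, for illustration only.
phi_services = ["agent-orchestrator", "transcript-store", "vector-search"]
baa_scope = ["agent-orchestrator", "transcript-store"]

gaps = services_outside_baa(phi_services, baa_scope)
if gaps:
    print("Handles PHI but not covered by the BAA:", ", ".join(gaps))
else:
    print("Every PHI-handling service is inside the BAA scope.")
```

Any non-empty result means the deployment is not yet HIPAA-eligible as configured, regardless of the vendor's overall certification status.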
EU AI Act. High-risk obligations under Articles 6–15 of the EU AI Act become enforceable August 2, 2026. A political agreement on the AI Act Omnibus was reached in trilogue on May 7, 2026, and the final text may shift some enforcement dates, so treat August 2, 2026 as the current live date with the Omnibus as a caveat. Penalties under the Act reach 7% of global annual turnover for prohibited practices; non-compliance with high-risk obligations carries a lower ceiling of 3%. Annex III high-risk categories include employment-related AI systems, critical infrastructure management, and essential private and public services, categories that cover most enterprise agent deployments in finance, healthcare, and HR. For organizations processing EU-resident data in these categories, hybrid or self-hosted deployment in an EU-resident cloud (AWS European Sovereign Cloud, Azure EU Data Boundary, or sovereign cloud) is the risk-appropriate posture. The AWS European Sovereign Cloud GmbH (a German-incorporated entity, physically and logically separate from other AWS regions) addresses CLOUD Act extraterritoriality concerns for EU customers.
For the full governance policy-as-code framework across all three deployment models, see our AI agent governance and compliance 2026 guide. For the computer-use enterprise playbook including compliance controls for UI-automation agents, see the computer-use enterprise automation playbook.
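The upgrade rule stated in the key takeaways (any single one of HIPAA-eligible data, FedRAMP, or EU AI Act Annex III high-risk classification moves the floor from SaaS to Hybrid) can be written as a small predicate. This is a minimal sketch of that rule only; the `ComplianceProfile` fields and the `minimum_deployment_model` name are assumptions made for illustration, and the function says nothing about when Self-Hosted is the better choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComplianceProfile:
    handles_hipaa_eligible_data: bool = False
    requires_fedramp: bool = False
    eu_ai_act_annex_iii_high_risk: bool = False


def minimum_deployment_model(profile):
    """Return (floor model, list of triggers hit) under the any-one-trigger rule."""
    checks = (
        ("HIPAA-eligible data", profile.handles_hipaa_eligible_data),
        ("FedRAMP", profile.requires_fedramp),
        ("EU AI Act Annex III high-risk", profile.eu_ai_act_annex_iii_high_risk),
    )
    triggers = [name for name, hit in checks if hit]
    # A single trigger is enough to rule out default SaaS.
    return ("Hybrid" if triggers else "SaaS", triggers)


model, reasons = minimum_deployment_model(
    ComplianceProfile(eu_ai_act_annex_iii_high_risk=True)
)
print(model, reasons)
```

Returning the list of triggers alongside the model keeps the decision auditable: the same output can be attached to the architecture decision record.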
09 — Recommendation Tree
16-node recommendation tree: team size × use case → specific platform.
The matrix scorecard tells you which model wins across 12 dimensions. The recommendation tree tells you which specific platform to deploy given your team size and use case. Four team-size buckets × four use cases = 16 terminal nodes. Each node includes the recommended platform and a one-line rationale. This tree assumes the organization has already decided to use an agent platform rather than build from scratch — for the prior-stage decision, see our enterprise AI agent build vs. buy framework.
Team size: 1–10 engineers
Customer support agents
Recommendation: SaaS — Agentforce ESC or Copilot Studio. Rationale: prebuilt CRM and ticketing integrations beat any custom build for a small team. Time-to-value is days. The compliance overhead of self-hosting is not justified at this scale unless the organization is already HIPAA-mandated.
Coding agents
Recommendation: Self-Hosted — Anthropic API with BYOK. Rationale: BYOK keeps token cost predictable; no per-seat tax; Claude Code handles most of the workflow. Anthropic self-hosted sandboxes (public beta) add local execution context. Small coding teams have the technical depth to manage their own API keys without a full DevOps function.
Internal-ops agents
Recommendation: SaaS — Microsoft Copilot Studio. Rationale: lowest time-to-value for IT-adjacent automations (calendar, email, Teams, SharePoint). MCP server ecosystem covers most internal-ops tool integrations without custom code. Computer-use GA (May 13, 2026) adds UI-automation for legacy apps.
Customer-facing chat
Recommendation: SaaS — Agentforce ESC. Rationale: voice + chat in one platform; small team can ship without backend infrastructure. Agentforce Voice handles telephony-connected agents. The Trust Layer covers basic PII masking requirements.
Team size: 10–50 engineers
Customer support agents
Recommendation: SaaS — Agentforce 360. Rationale: at 10–50 engineers, the Trust Layer and Command Center observability earn their license fee. The team is large enough to configure and maintain the platform but likely does not have a dedicated infrastructure function to run self-hosted infrastructure at production quality.
Coding agents
Recommendation: Self-Hosted — LangSmith Deployment + Anthropic sandbox (public beta). Rationale: at this team size, differentiated coding-agent behavior (custom code analysis, bespoke tool schemas, custom eval pipelines) justifies the framework investment. Customization beats vendor flow builder for differentiation.
Internal-ops agents
Recommendation: Hybrid — Copilot Studio with custom MCP tunnels. Rationale: native M365 integration covers the Microsoft surface; MCP tunnels (Anthropic research preview) or custom MCP servers handle non-Microsoft systems. The control plane stays managed; custom tools extend it without full self-host.
Customer-facing agents
Recommendation: Hybrid — Agentforce 360 for AWS. Rationale: at 10–50 engineers, brand-controlled UI plus AWS region residency becomes important for customer trust. Agentforce 360 for AWS provides Salesforce Trust Layer governance with AWS-resident data plane. Available in AWS Marketplace as of early 2026.
Team size: 50–200 engineers
Customer support agents
Recommendation: Hybrid — Agentforce 360 for AWS. Rationale: at 50–200 engineers, compliance evidence ownership becomes essential for security audits and enterprise procurement. The hybrid model allows the customer to own data-plane evidence while Salesforce manages control-plane compliance.
Coding agents
Recommendation: Self-Hosted — LangSmith Deployment Enterprise self-hosted. Rationale: at mass-parallel coding agent scale, self-hosted TCO wins. Token cost at this team size demands BYOK. An organization of 50–200 engineers already needs a dedicated DevOps function, and that function can absorb the added infrastructure work.
Internal-ops agents
Recommendation: Hybrid — LangSmith Deployment Hybrid tier or Microsoft Agent Framework. Rationale: stitch into existing data pipelines with customer-controlled data plane; vendor SLA on control plane for uptime guarantees. MAF is the better choice for Microsoft-ecosystem organizations.
Customer-facing agents
Recommendation: Hybrid — Agentforce 360 for AWS. Rationale: sovereign cloud plus multi-region failover at this scale; the compliance overhead of full self-host is not justified when Agentforce for AWS provides an equivalent data-residency posture with managed control-plane SLA.
Customer support agents
Recommendation: Hybrid — Agentforce 360 for AWS. Rationale: audit and governance at this scale can only be sustained by the hybrid model. The vendor provides control-plane audit trails; the customer provides data-plane evidence. Full self-host introduces operational risk that outweighs the cost savings at 200+ engineers.
Coding agents
Recommendation: Self-Hosted — Microsoft Agent Framework on AKS or LangSmith Deployment fully self-hosted. Rationale: token spend at 200+ engineers demands BYOK and sovereign infrastructure. The team has the DevOps function to sustain it. MAF is the Microsoft-ecosystem choice; LangSmith self-hosted for framework-agnostic organizations.
Internal-ops agents
Recommendation: Hybrid — Microsoft Agent Framework + Azure Foundry control plane. Rationale: Forrester TEI study found 327% ROI over 3 years from the MAF + Foundry combination, with $15.7M from developer productivity. Foundry serves as the managed control plane for policy, telemetry, and eval; MAF handles execution on customer infrastructure.
Customer-facing agents
Recommendation: Hybrid — Agentforce 360 for AWS in EU-resident sovereign region. Rationale: EU AI Act high-risk obligations (enforceable August 2, 2026) and CLOUD Act extraterritoriality concerns force sovereign cloud for customer-facing agents processing EU-resident data at this scale. Agentforce for AWS in the AWS European Sovereign Cloud GmbH provides the compliant hybrid posture.
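The routing above is essentially a lookup keyed by team size and use case. The following is a minimal Python sketch of that idea, not an official encoding of the tree. The names `TREE`, `band_for`, and `recommend` are hypothetical, the `"<10"` label for the smallest band is an assumption, and treating each band's upper bound as exclusive is also an assumption (the text does not say where a team of exactly 50 lands). Only the nodes listed in this section are encoded, so the smallest band has no customer-support entry here.

```python
from typing import NamedTuple, Optional


class Recommendation(NamedTuple):
    deployment: str  # "SaaS", "Self-Hosted", or "Hybrid"
    platform: str


# Keys are (team-size band, use case). Band labels and use-case slugs are
# illustrative; "<10" for the smallest band is an assumption.
TREE = {
    ("<10", "coding"): Recommendation("Self-Hosted", "Anthropic API with BYOK"),
    ("<10", "internal-ops"): Recommendation("SaaS", "Microsoft Copilot Studio"),
    ("<10", "customer-facing"): Recommendation("SaaS", "Agentforce ESC"),
    ("10-50", "customer-support"): Recommendation("SaaS", "Agentforce 360"),
    ("10-50", "coding"): Recommendation(
        "Self-Hosted", "LangSmith Deployment + Anthropic sandbox (public beta)"),
    ("10-50", "internal-ops"): Recommendation(
        "Hybrid", "Copilot Studio with custom MCP tunnels"),
    ("10-50", "customer-facing"): Recommendation("Hybrid", "Agentforce 360 for AWS"),
    ("50-200", "customer-support"): Recommendation("Hybrid", "Agentforce 360 for AWS"),
    ("50-200", "coding"): Recommendation(
        "Self-Hosted", "LangSmith Deployment Enterprise self-hosted"),
    ("50-200", "internal-ops"): Recommendation(
        "Hybrid", "LangSmith Deployment Hybrid tier or Microsoft Agent Framework"),
    ("50-200", "customer-facing"): Recommendation("Hybrid", "Agentforce 360 for AWS"),
    ("200+", "customer-support"): Recommendation("Hybrid", "Agentforce 360 for AWS"),
    ("200+", "coding"): Recommendation(
        "Self-Hosted",
        "Microsoft Agent Framework on AKS or LangSmith Deployment fully self-hosted"),
    ("200+", "internal-ops"): Recommendation(
        "Hybrid", "Microsoft Agent Framework + Azure Foundry control plane"),
    ("200+", "customer-facing"): Recommendation(
        "Hybrid", "Agentforce 360 for AWS in EU-resident sovereign region"),
}


def band_for(engineers: int) -> str:
    """Map a headcount to a band; upper bounds are exclusive (an assumption)."""
    if engineers < 10:
        return "<10"
    if engineers < 50:
        return "10-50"
    if engineers < 200:
        return "50-200"
    return "200+"


def recommend(engineers: int, use_case: str) -> Optional[Recommendation]:
    """Return the tree node for this team, or None if the node is not encoded."""
    return TREE.get((band_for(engineers), use_case))


if __name__ == "__main__":
    print(recommend(30, "coding"))
    print(recommend(250, "customer-facing"))
```

Returning `None` for an unknown pair, rather than falling back to a neighboring band, keeps the sketch from implying a recommendation the tree does not make.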
10 — Cheat Sheet
12-line defaults: what to start with before engaging the full matrix.
Before a team engages the full 12-dimension matrix, a one-line default per dimension reduces the decision space to the cases where the answer is non-obvious. These defaults are starting points, not final answers — they apply absent specific organizational context that might invert a dimension's weighting.
Until monthly token bill exceeds the platform license
Default to SaaS until self-hosted infrastructure cost (including DevOps team allocation) is lower than the SaaS license + consumption fees at your actual volume. The crossover point is typically 30K+ monthly agent executions.
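The crossover can be estimated with simple break-even arithmetic. The sketch below assumes a linear cost model (a fixed monthly amount plus a per-execution amount on each side), which is a simplification. Every parameter name and every number in the example is hypothetical and should be replaced with your own quotes and measured volumes.

```python
from typing import Optional


def monthly_saas_cost(executions: float, license_fee: float,
                      fee_per_execution: float) -> float:
    """SaaS: flat license plus consumption fees."""
    return license_fee + executions * fee_per_execution


def monthly_self_hosted_cost(executions: float, infra_fixed: float,
                             devops_allocation: float,
                             token_cost_per_execution: float) -> float:
    """Self-hosted: infrastructure plus DevOps allocation plus BYOK token spend."""
    return infra_fixed + devops_allocation + executions * token_cost_per_execution


def crossover_executions(license_fee: float, fee_per_execution: float,
                         infra_fixed: float, devops_allocation: float,
                         token_cost_per_execution: float) -> Optional[float]:
    """Monthly volume above which self-hosting becomes cheaper than SaaS.

    Returns None when self-hosting is not cheaper per execution, because in
    that case growing volume never tips the balance toward self-hosting.
    """
    fixed_gap = (infra_fixed + devops_allocation) - license_fee
    marginal_saving = fee_per_execution - token_cost_per_execution
    if marginal_saving <= 0:
        return None
    # A negative gap means self-hosting is already cheaper at zero volume.
    return max(0.0, fixed_gap / marginal_saving)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the arithmetic.
    volume = crossover_executions(
        license_fee=5_000, fee_per_execution=0.75,
        infra_fixed=5_000, devops_allocation=15_000,
        token_cost_per_execution=0.25,
    )
    print(volume)  # 30000.0 for these illustrative inputs
```

The DevOps allocation usually dominates the fixed gap, which is why the default leans on SaaS until volume is high enough to amortize it.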
The moment you cross any one of HIPAA / FedRAMP / EU AI Act Annex III
Default to Hybrid the moment one of three thresholds is crossed: HIPAA-eligible data, FedRAMP authorization requirement, or EU AI Act Annex III high-risk classification. A signed vendor BAA alone is insufficient for data-plane evidence ownership.
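As a rough illustration, the three-threshold rule can be written as a small check that also records which trigger fired, which is useful when documenting the decision. The `ComplianceProfile` class, its field names, and both function names are hypothetical. Whether a given workload actually meets one of these thresholds is a legal determination that this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComplianceProfile:
    """Hypothetical flags; each one is decided by your compliance team."""
    hipaa_eligible_data: bool = False
    fedramp_required: bool = False
    eu_ai_act_annex_iii: bool = False


def fired_triggers(profile: ComplianceProfile) -> List[str]:
    """List the thresholds that this profile crosses."""
    checks = [
        ("HIPAA", profile.hipaa_eligible_data),
        ("FedRAMP", profile.fedramp_required),
        ("EU AI Act Annex III", profile.eu_ai_act_annex_iii),
    ]
    return [name for name, hit in checks if hit]


def default_deployment(profile: ComplianceProfile, baseline: str = "SaaS") -> str:
    """Any single trigger moves the default to Hybrid; otherwise keep the baseline."""
    return "Hybrid" if fired_triggers(profile) else baseline


if __name__ == "__main__":
    profile = ComplianceProfile(hipaa_eligible_data=True)
    print(default_deployment(profile), fired_triggers(profile))
```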
BYOK Hybrid if you ever want to switch models without rewriting
Default to BYOK Hybrid if there is any likelihood of switching model providers (e.g., Anthropic to Google Vertex, or OpenAI to Bedrock). Hybrid control planes with BYOK data planes let you re-point model inference without rewriting agent orchestration code.
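One way to picture re-pointing inference without rewriting orchestration is an adapter boundary: the agent logic depends on a narrow interface, and the provider is chosen from configuration. The sketch below uses stub clients rather than any real vendor SDK. `ModelClient`, the stub classes, `PROVIDERS`, and `build_client` are hypothetical names, and a real adapter would wrap the provider's own client library behind the same `complete` method.

```python
from typing import Callable, Dict, Protocol


class ModelClient(Protocol):
    """The only surface the orchestration code is allowed to depend on."""
    def complete(self, prompt: str) -> str: ...


class StubAnthropicClient:
    """Stand-in for an adapter around a real SDK; makes no network calls."""
    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def complete(self, prompt: str) -> str:
        return f"[anthropic] {prompt}"


class StubVertexClient:
    """Second stand-in adapter, exposing the same interface."""
    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def complete(self, prompt: str) -> str:
        return f"[vertex] {prompt}"


# Registry of adapters; adding a provider means adding one entry here.
PROVIDERS: Dict[str, Callable[[str], ModelClient]] = {
    "anthropic": StubAnthropicClient,
    "vertex": StubVertexClient,
}


def build_client(config: Dict[str, str]) -> ModelClient:
    """Pick the adapter from configuration; the customer supplies the key."""
    return PROVIDERS[config["provider"]](config["api_key"])


def run_agent_step(client: ModelClient, task: str) -> str:
    """Orchestration logic: identical regardless of which provider is configured."""
    return client.complete(f"Plan the next action for: {task}")


if __name__ == "__main__":
    for provider in ("anthropic", "vertex"):
        client = build_client({"provider": provider, "api_key": "customer-held-key"})
        print(run_agent_step(client, "triage ticket 123"))
```

Switching providers here changes one configuration value, while `run_agent_step` stays untouched. Prompt formats and tool-calling conventions differ between providers in practice, so a production adapter would have more to normalize than this sketch suggests.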
Always default to SaaS for time-to-value
If the primary constraint is time — board deadline, competitive pressure, vendor demo requirement — SaaS is the correct default. Days-to-weeks vs. months is a decisive difference. You can always migrate to Hybrid later; you cannot recover lost time.
The remaining eight dimension defaults:
(5) Vendor lock-in: default to Hybrid if your organization's AI strategy includes model switching or vendor diversification.
(6) Integration breadth: SaaS if you need 50+ prebuilt connectors immediately, self-hosted if you are building non-standard tool integrations.
(7) Observability: SaaS if a vendor dashboard is sufficient, self-hosted or Hybrid if OTel ownership matters for existing monitoring infrastructure.
(8) Eval: self-hosted or Hybrid if you need to run custom eval datasets against external benchmarks.
(9) Governance: Hybrid for two-layer policy stacking, SaaS if vendor Trust Layer covers your policy requirements.
(10) Scale: self-hosted or Hybrid if you anticipate concurrent agent counts that exceed vendor-tier throttle ceilings.
(11) Model freedom: self-hosted if BYO weights is a firm requirement, Hybrid if BYOK suffices.
(12) Data residency: Hybrid or self-hosted in sovereign cloud for EU AI Act Annex III scope, SaaS with vendor region picker for standard requirements.
For teams implementing the AI transformation program that this decision matrix fits into, our AI transformation advisory service covers deployment model selection, vendor evaluation, and governance framework design as a structured engagement.
The deployment decision is the production gap — and the hybrid model closes it for most enterprises.
The 12-dimension matrix confirms what the production statistics suggest: Hybrid wins on the dimensions that matter most for compliance-bound enterprises, namely data residency, governance stacking, compliance flexibility, and scale ceiling. But Hybrid's 4.2/5 weighted average does not mean it is the right answer for every organization. The 16-node terminal recommendation tree makes clear that SaaS remains the correct answer for most non-coding use cases on teams under 50 engineers without a compliance or residency trigger, and that self-hosted remains the correct answer for coding-agent use cases at scale where BYOK and customization outweigh time-to-value constraints.
For the remainder of 2026, the Hybrid category is likely to consolidate further. Agentforce 360 for AWS will reportedly expand its sovereign-region coverage. LangSmith Deployment's hybrid tier should mature as more organizations move from LangGraph prototypes to production. The EU AI Act enforcement date of August 2, 2026 (subject to Omnibus amendments) is expected to force a wave of compliance-driven deployment model upgrades across European organizations that have been operating in the SaaS tier. Organizations that complete their deployment model decision before that date will be positioned to benefit from the AI productivity gains the Forrester TEI data suggests; those that defer will reach the deadline with pilots still in progress. The matrix is the instrument for making that decision in weeks rather than months.