AI Data Residency: Architecture Patterns + Compliance 2026

Digital Applied Team · Senior architects · Published May 15, 2026 · 16 min read

Region pinning, prompt redaction at the edge, sovereign cloud overlays — the architecture patterns that turn data residency from a deployment blocker into a customer-facing feature. Vendor regional support, in-region retrieval, and observability without residency violations.
  • Architecture patterns: 6, covered end-to-end
  • Sovereign cloud regions: 4 (EU · UK · APAC · GCC)
  • Compliance frameworks: 5 (GDPR · UK DPA · PIPL · PDPL · AI Act)
  • Vendor regional support: varies; verify per-model, per-region

AI data residency has moved from a checkbox on a procurement form to an architectural axis that determines whether a system ships at all. The patterns that work in 2026 — region-pinned inference, edge prompt redaction, in-region retrieval, sovereign cloud overlays — are not bolt-ons. They are design decisions made before the first prompt is written.

What changed is the surface area. A typical generative AI application now spans a model API, a vector store, an observability backend, an evaluation harness, a prompt-cache layer, and a feedback loop into fine-tuning or preference data. Every one of those surfaces is a potential cross-border transfer. Residency arguments that focus only on "where the model runs" miss the other five places customer data leaves the region.

This guide covers the six architecture patterns we use to keep AI workloads in-region across EU, UK, APAC, and GCC deployments — the vendor regional pickers worth knowing, the redaction patterns that hold up at the edge, the retrieval and observability designs that respect residency without crippling product quality, and how the patterns map onto GDPR, UK DPA, PIPL, PDPL, and the EU AI Act.

Key takeaways
  1. Data leaves the region in invisible places. Inference is only one of six surfaces. Telemetry, evaluation, prompt caching, fine-tuning feedback, and observability all routinely cross borders unless explicitly pinned. Residency reviews that audit only the model API miss the other five.
  2. Region pinning is necessary but not sufficient. Bedrock, Vertex AI, and Azure OpenAI all offer regional pickers, but per-model and per-feature support varies by region. Pin the region, then verify the specific model variant and feature set are actually available in that region before committing.
  3. Edge redaction is the cheapest residency control. Stripping PII and customer identifiers at the edge, before the prompt crosses into a model API region, eliminates an entire class of residency questions. The redaction can run in a CDN worker or a regional API gateway in tens of milliseconds.
  4. Sovereign overlays are required for some sectors. Health, defense, and certain public-sector workloads cannot use commercial multi-tenant regions even when the region is geographically correct. Sovereign cloud variants (AWS European Sovereign Cloud, Microsoft Cloud for Sovereignty, GCC sovereign zones) exist for these cases.
  5. Observability must respect residency too. Cross-region evaluation, prompt logging, and model-output sampling are the silent residency violators. Design observability with regional log sinks, federated dashboards, and redacted aggregation, not a central US bucket collecting every interaction.

01 · Why Residency: Data leaves the region in invisible places

The residency conversation usually starts in the wrong place. A procurement reviewer asks "where does the model run?" The architect answers "eu-west-1." The checkbox is ticked, the project ships, and six months later an internal audit discovers that prompt logs are streaming to a US-based observability backend, evaluation samples are being graded by a model in another region, and the prompt cache vendor stores tenant-tagged inputs in a global key store. None of these were intentional decisions. They were defaults nobody noticed.

The shape of a modern generative AI application is the problem. Inference is one of six surfaces where customer data routinely crosses regions:

  • The model API itself. Inputs, outputs, and any system prompt context — the obvious one.
  • Retrieval. The vector store, the re-ranker, and any external search APIs the agent calls.
  • Observability. Prompt logs, output logs, tool-call traces, eval samples, and the model-graded quality scores that get computed on those samples.
  • Prompt caching. The Anthropic-style cache key, OpenAI's prompt cache index, third-party caching proxies — all of them tag-and-store inputs by default.
  • Feedback & fine-tuning. Thumbs-up / thumbs-down data, prompt-completion pairs collected for training, DPO datasets, and any data sent to a fine-tuning API.
  • Vendor support telemetry. The diagnostic data the model provider may collect for abuse monitoring, service health, or safety review — governed by the provider's own data processing addendum.
The reframe that matters
Residency is not "where the model runs" — it is every place a customer-derived byte rests, even briefly, across every component of the system. Architecting for residency means designing each of the six surfaces above with a deliberate regional answer, not inheriting whatever default the vendor ships.

The business case for getting this right is no longer abstract. EU customers ask for documented regional commitments on day one of a procurement cycle. UK public sector tenders increasingly require a UK-only data path. Saudi PDPL enforcement has accelerated through 2025 and 2026. China PIPL cross-border transfer rules now require either a standard contract filing or a security assessment for most outbound transfers of personal information. Residency is sales enablement; failing it disqualifies entire pipelines.

For a deeper view of how the EU AI Act lays sector-specific residency expectations on top of GDPR, see our companion piece on the EU AI Act compliance checklist by risk tier — it covers the documentation, audit, and provider-vs-deployer split that determines where residency obligations actually land in your stack.

"The residency surface is six places wide. Architectures that audit only the model API ship with five blind spots."— A pattern from our 2026 AI architecture reviews

02 · Region Pinning: Bedrock, Vertex AI, Azure OpenAI regional pickers

Region pinning is the foundation pattern: the model call executes in a specific regional endpoint and the provider commits — contractually and technically — that input and output bytes stay in that region for the duration of inference. Every major hyperscaler offers some form of this in 2026, but the support matrix differs per provider, per model, and per feature.

The three serious regional offerings

AWS Bedrock exposes a regional endpoint per model family. Inference runs in the region of the endpoint you call, and AWS commits that prompt and completion data are not used for service improvement and do not leave the region in a service-default deployment. Cross-region inference is opt-in (the Cross-Region Inference feature), disabled by default in regulated accounts. Model availability per region varies — Claude in EU and AP regions lags US availability by a quarter or two, and not every Anthropic or Mistral variant ships in every region.

Google Vertex AI offers regional endpoints for Gemini models with explicit data residency commitments per region. The console exposes a region picker per model family; the API surface uses a per-region endpoint hostname. Google publishes a regional availability table that lists which Gemini variants are GA, preview, or unavailable per region — verify before committing because preview-only availability often carries different residency guarantees than GA.

Azure OpenAI runs OpenAI models inside Azure regions with the strongest regional posture of the three for European deployments — the EU Data Boundary commitment covers Azure OpenAI for most data types. Model availability per Azure region is the tightest constraint; GPT-4-class and GPT-5-class models ship to a smaller set of regions than smaller variants. Provisioned Throughput Units (PTUs) are region-specific and require regional capacity planning.

AWS Bedrock
Regional endpoints, opt-in cross-region

Per-region endpoints with default in-region commitment. Cross-Region Inference is opt-in and can be disabled at the account or service-control-policy level. Model availability lags US in EU and AP regions; verify the exact variant before locking architecture.

Pick Bedrock for AWS-native stacks
Google Vertex AI
Per-region endpoints, published residency table

Regional endpoints for Gemini with explicit residency commitments per region. Console region picker, per-region API hostname. GA vs preview availability differs — preview features sometimes carry weaker residency guarantees than GA.

Pick Vertex AI for GCP stacks
Azure OpenAI
EU Data Boundary coverage

Strongest European posture of the three — Azure OpenAI is covered by the EU Data Boundary commitment for most data types. Tightest constraint is per-region model availability; GPT-4 and GPT-5 class models ship to a smaller set of regions than smaller variants.

Pick Azure OpenAI for EU deployments
Direct model providers
OpenAI, Anthropic, Mistral direct APIs

Direct APIs typically default to a global routing layer. OpenAI offers EU residency add-ons; Anthropic publishes a data processing addendum but the API itself does not expose regional endpoint selection in the same way hyperscalers do. Verify per-provider; do not assume.

Verify regional posture explicitly

The non-obvious failure mode is partial coverage. A team pins inference to eu-west-1 and considers residency handled — but the Bedrock guardrails feature, the prompt caching feature, or the model evaluation feature may run in a different region or be globally pooled. Read the per-feature regional documentation, not just the per-endpoint commitment. Hyperscalers are explicit when a specific feature is excluded from the regional commitment; architects who skip that page ship with a hole.

What a clean region-pinning policy looks like

Codify three things in the platform layer, not in per-service config: (1) the allowed regional endpoints for each model family, enforced via SDK initialization or a service-control policy that denies calls to disallowed regions; (2) the per-feature exclusion list — the cache, eval, guardrail, or auxiliary services that are not covered by the regional commitment, surfaced in code as explicit opt-in flags; (3) the failure mode when a region runs out of capacity — fail closed (refuse the request) rather than failing open to another region.

The pattern that holds up
Region pinning is a platform-level control, not an application-level one. Enforce it in the SDK factory and in the cloud account's service-control policies — never trust per-service config to remain in policy across hundreds of deployments.

03 · Edge Redaction: Strip PII before cross-region transit

Edge redaction is the cheapest residency control we deploy, and it sits one layer above region pinning. The idea is simple: before a prompt crosses any regional boundary — even into a properly pinned model endpoint — strip the customer identifiers, PII, and sensitive entities out of the payload at the edge, replace them with stable placeholder tokens, and re-hydrate after the response returns. If the redaction is correct, no customer-identifying byte ever crosses a region, which collapses entire categories of residency review.

Where the redaction runs

Three deployment points work in practice. The first is a CDN worker — Cloudflare Workers, Vercel Edge Middleware, or AWS CloudFront Functions — running in the same region as the user. The second is a regional API gateway sitting in front of the model API, which gives more compute headroom and easier integration with secret stores. The third is the application layer itself, where the redaction runs in the API route before the model call; this is simplest to implement but harder to verify independently in audit because the same code that needs access to PII is also responsible for redacting it.

What gets redacted

A workable taxonomy: (1) direct identifiers — names, email addresses, phone numbers, government IDs; (2) account identifiers — customer IDs, account numbers, transaction IDs that could re-identify the user; (3) free-text PII the user typed into the prompt itself (addresses, employer names, health terms); (4) attachment content for documents and images, redacted via a separate document-redaction pipeline before the attachment is uploaded.

The redaction is replacement, not deletion. Names become [PERSON_A], account numbers become [ACCOUNT_1], addresses become [LOCATION_3]. The mapping from placeholder to real value lives in a regional key-value store; the model call goes out with placeholders only; the response comes back with placeholders that the edge worker rehydrates before returning to the client. Per-session mapping prevents cross-prompt re-identification by a curious operator with log access.
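The replace-and-rehydrate flow can be sketched as follows. This is a minimal illustration, not production code: the single email regex stands in for a real entity-detection stack, and the in-memory mapping stands in for the regional key-value store.

```python
import re

# Sketch of placeholder redaction: deterministic per-session mapping,
# replacement rather than deletion, rehydration after the model responds.
# The regex detector is a stand-in for a real NER stack such as Presidio.

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


class SessionRedactor:
    def __init__(self) -> None:
        self._forward: dict[str, str] = {}  # real value -> placeholder
        self._reverse: dict[str, str] = {}  # placeholder -> real value

    def _placeholder(self, value: str, kind: str) -> str:
        # Same entity in the same session always gets the same token.
        if value not in self._forward:
            token = f"[{kind}_{len(self._forward) + 1}]"
            self._forward[value] = token
            self._reverse[token] = value
        return self._forward[value]

    def redact(self, prompt: str) -> str:
        # Only placeholders cross the regional boundary.
        return EMAIL.sub(lambda m: self._placeholder(m.group(), "EMAIL"),
                         prompt)

    def rehydrate(self, response: str) -> str:
        # Restore real values before returning to the client, in-region.
        for token, value in self._reverse.items():
            response = response.replace(token, value)
        return response
```

Because the mapping is scoped to one session, a log reader who sees `[EMAIL_1]` in one conversation cannot correlate it with `[EMAIL_1]` in another.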

Edge
CDN worker redaction
Cloudflare / Vercel / CloudFront

Runs in the same region as the user. Lowest latency. Smallest blast radius — the worker only sees one request at a time. Harder to update redaction rules in lockstep across providers. Right default for low-complexity redaction.

Lowest latency
Gateway
Regional API gateway
Kong / Envoy / cloud-native gateway

Sits in front of the model API in the same region. More compute headroom. Cleaner integration with secret stores, audit logs, and rate limiting. The production default for medium and high complexity redaction.

Production default
Application
In-app pre-call hook
API route middleware

Simplest to implement — the redaction runs in the same code path as the model call. Harder to verify in audit because the redacting code also has the original PII. Useful as a starting point; migrate to gateway-level for serious deployments.

Easiest to ship

The redactor itself should be deterministic at the entity level — the same email address in the same session always maps to the same placeholder — and stable across reasonable spelling variants. For the entity detector, a small NER model running locally in the worker (Presidio, a distilled spaCy variant, or a regex-plus-validator stack for structured identifiers) is the typical implementation. LLM-based redaction is feasible but pushes the cost and latency profile in the wrong direction; reserve LLM redaction for the long tail of unstructured PII the classical NER misses.

The verification step that matters

Edge redaction without an audit trail is theater. Every redacted call should write three records to the regional log sink: the placeholder-to-entity mapping (encrypted at rest, short retention), the redacted prompt that was sent to the model, and the redacted response that came back. Regular automated sampling of the redacted prompts — a second classifier looking for missed PII — catches redactor drift before it shows up in a compliance audit.
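That automated sampling step can be as simple as a second, independent detector run over redacted prompts pulled from the regional sink. The sketch below assumes nothing about your redactor; the two leak patterns are illustrative and a real deployment would use a broader detector than the one being audited.

```python
import re

# Hypothetical drift check: an independent second-pass scanner that flags
# redacted prompts which still contain PII-shaped content. Patterns are
# illustrative; a real check should not share rules with the redactor.

LEAK_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def scan_redacted_sample(prompts: list[str]) -> list[tuple[int, str]]:
    """Return (prompt_index, leak_kind) for every suspected miss."""
    findings = []
    for i, prompt in enumerate(prompts):
        for kind, pattern in LEAK_PATTERNS.items():
            if pattern.search(prompt):
                findings.append((i, kind))
    return findings
```

Run it on a scheduled sample inside the region; a non-empty result is a redactor-drift alert, caught before an auditor finds it.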

"If no customer-identifying byte ever crosses the region, residency questions about that surface stop being a debate and start being a measurement."— Edge redaction as a residency primitive

04 · In-Region Retrieval: Regional vector stores, replication patterns

Retrieval is the second-largest residency surface after inference and the one most teams skip. A vector store holds the embedded representation of customer content, often in a database hosted in a single region of the vendor's choice. If the retrieval call crosses the region, every customer document the agent grounds on crosses the region with it.

Three retrieval architectures hold up in 2026 depending on the residency posture required. The patterns differ on cost, on operational complexity, and on what they assume about the workload.

Single-region
One primary per tenant
tenant → assigned region

Each tenant is provisioned in exactly one region — eu-west, us-east, ap-southeast. Retrieval calls always target that region. Simplest mental model. Lowest cost. Multi-region customers must pick a primary; cross-region read latency is the trade.

Simplest · Lowest cost
Replicated read
Per-region read replicas
writes → primary · reads → local

Writes go to a primary region (often the tenant's home region); reads go to a per-region replica. Each replica is a residency unit. Embedding regeneration runs in the primary; the replicated index is read-only in regions outside the primary.

Production default
Sharded
Per-region isolated shards
no cross-region replication

Each region holds the embeddings for tenants assigned to it; no replication crosses regions. The strictest residency posture. Operational complexity is highest — the retrieval router must know which region a tenant lives in, and migrations between regions are explicit data-movement events.

Strictest posture
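The sharded pattern's core obligation — a retrieval router that knows each tenant's home region and never falls back across regions — can be sketched in a few lines. Tenant names, region keys, and store URIs below are all illustrative.

```python
# Sketch of the sharded pattern's retrieval router. The router must know the
# tenant's home region and must fail closed rather than fall back to a store
# in another region. All names are illustrative.

TENANT_REGION = {"acme": "eu-west", "globex": "ap-southeast"}

REGIONAL_STORES = {
    "eu-west": "pgvector://eu-west/rag",
    "ap-southeast": "pgvector://ap-southeast/rag",
}


def retrieval_endpoint(tenant: str) -> str:
    region = TENANT_REGION.get(tenant)
    if region is None:
        # An unmapped tenant is a routing bug, not a reason to
        # default to a global or foreign-region store.
        raise LookupError(f"no home region assigned for tenant {tenant!r}")
    return REGIONAL_STORES[region]
```

Migrating a tenant between regions then becomes an explicit update to the mapping plus a deliberate data-movement job, never an implicit fallback.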

The vendor surface has improved. pgvector running in managed Postgres (RDS, Cloud SQL, Azure Database for PostgreSQL) inherits the regional commitments of the underlying database service — pin the database to a region and the vectors stay there. Dedicated vector DBs (Pinecone, Qdrant Cloud, Weaviate Cloud) all now offer regional deployment, but with varying degrees of regional model-evaluation telemetry that may or may not cross regions — verify the residency commitment for the operational plane, not just the data plane.

For teams architecting a Postgres-backed RAG layer inside this framework, our self-hosted RAG with pgvector tutorial walks through the schema, ingestion, and retrieval patterns — running pgvector inside a regional managed Postgres is one of the cleanest residency stories available.

The re-ranker question

Cross-encoder re-rankers (Cohere Rerank, Voyage Rerank) are a separate residency consideration. Cohere offers regional endpoints for Rerank; Voyage's residency posture is less mature as of 2026. A local re-ranker — a small cross-encoder running inside the same regional container as the retrieval service — sidesteps the question entirely at modest quality cost. Measure the recall difference on your own corpus before deciding whether the residency complexity of an external re-ranker is worth the quality lift.

The retrieval residency anti-pattern
Embedding-generation calls that run in a different region from the vector store. A US-based embedding API ingesting EU-tenant documents means the document text crosses the region during ingestion, even if the resulting vector is stored in the EU. Pin both the embedding endpoint and the vector store to the same region, or run a regional self-hosted embedding model.

05 · Sovereign Overlays: EU, UK, APAC, GCC deployments

Region pinning solves geographic residency. It does not solve sovereignty. For health, defense, certain financial-services workloads, and most public-sector workloads in regulated jurisdictions, the requirement extends beyond "the bytes are in the right country" to "the operational control plane is also bounded by local jurisdiction." That means no foreign-controlled support engineer with a path to customer data, no telemetry path into a parent company in a different jurisdiction, and often a separate legal entity operating the infrastructure under local law.

The sovereign-overlay landscape has matured significantly through 2025 and 2026. Four serious offerings are worth knowing.

EU sovereign
AWS
European Sovereign Cloud

Operated by AWS subsidiaries inside the EU under EU jurisdiction. Personnel access bounded to EU residents. Independent from the wider AWS commercial cloud at the operational plane. Phased GA through 2026 with limited initial service set; verify which AI services are in scope.

AWS · EU residents only
Microsoft
EU+
Cloud for Sovereignty

Configurable sovereignty controls on top of standard Azure regions — data localization, sovereign landing zones, and customer lockbox controls. EU Data Boundary commitment applies. Available globally with regional sovereignty configurations including national-cloud variants.

Azure · global with overlays
Google
EU
Sovereign Controls

Available as sovereign-controlled regions in partnership with local operators (T-Systems for Germany, Thales for France). Customer-managed encryption keys, access transparency, and operational sovereignty controls. Vertex AI coverage varies by partner region.

GCP · partner-operated
GCC
KSA
PDPL-aligned sovereign zones

Saudi Arabia, UAE, and broader GCC sovereign zones offered by hyperscalers and local operators. PDPL enforcement and explicit data-localization requirements drive demand. Verify which AI services and model families are GA in the relevant zone before committing.

Hyperscaler + local operator

The architecture pattern that makes sovereign overlays work without forking the entire codebase is environment parity with regional substitution. The application is built against a service-locator abstraction — "the model client," "the vector store," "the observability sink" — and each environment binds those interfaces to the sovereign-appropriate concrete implementation. The same container image runs in commercial and sovereign environments; only the environment configuration differs.
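A minimal sketch of that service-locator abstraction follows; the environment names and binding strings are illustrative stand-ins for whatever concrete clients your platform constructs.

```python
from dataclasses import dataclass

# Sketch of "environment parity with regional substitution": the application
# depends on abstract roles; each environment binds them to sovereignty-
# appropriate implementations. Binding strings are illustrative.


@dataclass(frozen=True)
class Bindings:
    model_client: str
    vector_store: str
    observability_sink: str


ENVIRONMENTS = {
    "commercial-eu": Bindings(
        model_client="bedrock:eu-west-1",
        vector_store="pgvector:eu-west-1",
        observability_sink="logs:eu-west-1",
    ),
    "sovereign-eu": Bindings(
        model_client="bedrock:eu-sovereign",
        vector_store="pgvector:eu-sovereign",
        observability_sink="logs:eu-sovereign",
    ),
}


def locate(env: str) -> Bindings:
    # Same container image everywhere; only this lookup differs per
    # environment, typically driven by deployment configuration.
    return ENVIRONMENTS[env]
```

Application code asks only for "the model client"; whether that resolves to a commercial or sovereign endpoint is decided entirely by the environment binding.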

Two practical constraints to plan for. First, service availability lag: sovereign overlays consistently lag commercial regions by a quarter or two on new AI service launches. The product surface available in sovereign should be a documented subset of the commercial surface, with explicit graceful-degradation for features that haven't shipped to the sovereign zone yet. Second, vendor support reach: a customer support engineer in a sovereign zone may not have a path to engineering in the parent organization for incident triage. Plan for longer incident MTTR in sovereign environments and document the support boundary clearly for end customers.

When sovereignty is the requirement
Treat sovereign overlay as a separate environment, not a configuration flag on commercial. Different operational ownership, different incident response, different release cadence, different feature surface — same application code, behind a service-locator abstraction that isolates the regional bindings.

06 · Observability: Cross-region eval without residency violations

Observability is where residency programs quietly fail. The model API runs in eu-west, the vector store runs in eu-west, the edge worker redacts at the edge — and the log sink, the eval pipeline, and the engineering dashboard all live in us-east. Every successful inference writes a record to a US bucket. The residency commitment is broken on the most-trafficked path in the system, and nobody notices until the auditor pulls the log access paths.

The pattern that holds up is federated observability: regional log sinks per region, regional aggregation, and a central dashboard that queries the regional sinks federatively rather than centralizing the underlying records. The engineering team sees a global view; no individual customer record ever leaves its region.

Residency posture of common observability designs

  • Centralized US bucket (all regions stream to one US-hosted sink): Violates
  • Regional sinks, cross-region dashboard (per-region storage; dashboard reads cross-region): Partial
  • Federated read, redacted aggregation (regional sinks; only aggregated metrics cross regions): Clean
  • Fully isolated per region (no cross-region read; regional dashboards only): Strictest

What writes where

A workable rule: anything that contains, or could be joined to, a customer-identifying token writes to the regional sink only. Aggregated metrics — counts, latencies, error rates, and model-grade quality scores computed inside the region — can cross to the central dashboard. Eval samples (prompt/response pairs used for quality measurement) stay in the region; the model-grader runs in the same region; only the score crosses to the dashboard.
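That rule can be encoded as a single write path that splits each record. The sketch below is illustrative: in-memory lists stand in for the regional log sinks and the central metrics store, and the field names are assumptions.

```python
# Sketch of the "what writes where" rule: the full record, which contains
# customer-derived text, goes to the regional sink only; a stripped aggregate
# crosses to the central dashboard. Sink structures are illustrative.

regional_sinks: dict[str, list[dict]] = {"eu-west": [], "us-east": []}
central_metrics: list[dict] = []


def write_record(region: str, record: dict) -> None:
    # Full prompt/response records never leave the region.
    regional_sinks[region].append(record)
    # Only region-computed aggregates cross the boundary:
    # counts, latencies, error rates, quality scores.
    central_metrics.append({
        "region": region,
        "latency_ms": record["latency_ms"],
        "quality": record["quality"],
    })
```

The central dashboard sees one row per interaction with no joinable customer token; an engineer who needs the underlying record must read it through the regional sink's own access path.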

The eval pipeline trap

The single most common residency violation in mature AI programs is the offline eval pipeline. A central team samples 1% of production traffic to grade quality offline, the sampler is centralized, and the eval samples — which are full prompt-completion records — stream cross-region into the eval team's working environment. Sometimes the eval samples are passed to a different model (often a more capable one in a different region) as the grading judge. Sometimes the eval results are exported into a notebook environment for analysis. Every one of those steps is a separate residency decision.

The right pattern: the sampler runs in-region, the grader model runs in-region (typically a smaller variant of the same family that runs production), the eval scores cross to the central dashboard, the underlying samples never do. If a human reviewer needs to read specific failed samples, they read them via a regional UI that fetches from the regional sink — the sample never crosses the region just because an engineer is looking at it.
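As a sketch of that in-region pipeline: sampling and grading both run inside the region, and only the aggregate score leaves it. The `grade` function below is a trivial placeholder for an in-region judge model, and the traffic schema is an assumption.

```python
import random

# Sketch of an in-region eval pipeline: sample and grade inside the region;
# export only aggregate scores. grade() is a placeholder for a regional
# model-graded quality check.


def grade(sample: dict) -> float:
    # Placeholder quality score; a real grader is a judge model
    # running in the same region as production inference.
    return 1.0 if sample["response"] else 0.0


def run_regional_eval(traffic: list[dict], rate: float,
                      rng: random.Random) -> tuple[list[dict], dict]:
    samples = [t for t in traffic if rng.random() < rate]
    scores = [grade(s) for s in samples]
    # Samples stay in the regional sink; only aggregates cross.
    exported = {
        "n": len(scores),
        "mean_quality": sum(scores) / len(scores) if scores else None,
    }
    return samples, exported
```

The first return value is written to the regional sink; the second is the only payload the central dashboard ever receives.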

Observability test that finds violations
For each of the six surfaces, ask: "where is this record written? Where is it read? Who has the credentials to read it from outside the region?" The third question is the one most programs skip. Cross-region IAM grants on regional log buckets are de facto cross-region transfer paths, even if no automated pipeline currently uses them.

07 · Compliance Map: GDPR, UK DPA, PIPL, PDPL, AI Act

Architecture patterns matter because compliance frameworks attach to them. The five frameworks most teams encounter in 2026 each lay slightly different obligations on the same architectural surfaces — and each has been updated or actively enforced through 2025 and 2026 in ways that shift how the architecture has to be designed.

GDPR
EU General Data Protection Regulation

Articles 44-49 govern cross-border transfers. Adequacy decisions, Standard Contractual Clauses (SCCs), or derogations remain the legal mechanisms. The EU-US Data Privacy Framework provides adequacy for certified US recipients. Residency architecture mostly side-steps SCC complexity by keeping data in-region.

Pin EU regions; redact at edge for any US-bound traffic
UK DPA
UK Data Protection Act + UK GDPR

Mirrors GDPR post-Brexit with UK-specific adequacy decisions and the UK International Data Transfer Agreement (IDTA) replacing SCCs for UK exports. UK-only data path is a frequent public-sector tender requirement; a separate UK region (not just an EU region) may be needed.

Pin UK regions for public sector; document UK-only path
China PIPL
China Personal Information Protection Law

Cross-border transfer of personal information requires standard contract filing, CAC security assessment, or certification depending on volume and category. Most foreign AI services cannot serve Chinese personal data directly without a local presence and registered model service.

Operate via local Chinese cloud or do not serve PRC data
Saudi PDPL
Saudi Personal Data Protection Law

Enforcement accelerated through 2025-2026. Cross-border transfers permitted under specific conditions including adequacy, contractual safeguards, or controller authorization. GCC sovereign cloud zones are the practical architecture answer for regulated PDPL workloads.

Use GCC sovereign zone for regulated KSA workloads
EU AI Act
Risk-tier obligations on top of GDPR

Does not by itself impose new residency rules but adds documentation, transparency, and post-market monitoring obligations that often imply in-region operational logging. High-risk and general-purpose AI obligations interact with residency through the audit-trail requirement. See our compliance-by-risk-tier checklist.

Layer AI Act docs on top of residency architecture

The simplification that helps architects: the architectural patterns covered in sections 02-06 — region pinning, edge redaction, in-region retrieval, sovereign overlay, federated observability — are framework-agnostic. They satisfy the technical requirements that every framework above turns into a different legal question. Build the architecture first; the legal mapping is a documentation exercise on top of an architecture that already keeps data where it needs to be.

For teams operating across multiple frameworks simultaneously — which is most cross-border SaaS in 2026 — the right organizing principle is strictest-framework-wins per region. If a region serves customers covered by PDPL and GDPR, architect to PDPL's controls (which are typically stricter on operational sovereignty), and GDPR compliance follows. The cost of architecting to the strictest framework is modest at design time and substantial savings at audit time.

For end-to-end work mapping these residency patterns into a specific application — from inference layer to observability — our AI digital transformation engagements start with exactly this kind of regional and sovereignty map for the system in question.

Conclusion

Data residency is architectural — design for it, don't bolt it on.

The teams that ship AI in regulated sectors in 2026 have stopped treating residency as a checkbox at the end of procurement and started treating it as an architectural axis at the start of design. Region pinning at the platform layer. Edge redaction in front of any cross-region transit. Regional retrieval with deliberate replication patterns. Sovereign overlays as a separate environment, not a configuration flag. Federated observability that keeps customer records in-region while still giving engineering a global view. None of these are new primitives. The shift is using them together, by default, on day one of the architecture.

The honest framing is that residency is no longer a technical inconvenience — it is a product feature. Customers ask about it before signing. Procurement teams disqualify vendors that cannot answer it cleanly. Public sector tenders increasingly require it as a baseline. Architectures that treat residency as a first-class design concern open pipelines that residency-naive architectures cannot enter. That is not a compliance argument; it is a revenue argument with a compliance tail.

The work in the next eighteen months is mostly consolidation. The hyperscaler regional matrices are still uneven. Sovereign overlays still lag commercial regions on AI service availability. Direct model providers' residency postures are still maturing. Architects who design against the patterns above — documented in six places, audited against three questions, layered against five compliance frameworks — ship systems that absorb that volatility without re-architecting every quarter. That is the practical definition of designing for residency rather than bolting it on.

Architect for residency

Our team designs and operates region-pinned AI architectures — inference, retrieval, observability, audit trails — for EU, UK, APAC, and GCC deployments.

Free consultation · Expert guidance · Tailored solutions
What we architect

Residency-first AI engagements

  • Region-pinned inference architecture
  • Edge redaction pipeline implementation
  • In-region retrieval with vector-store replication
  • Sovereign overlay deployment
  • Cross-region observability with residency discipline
FAQ · Data residency

The questions architects ask before cross-region rollout.

Do the major model providers actually support regional endpoints?

All three major hyperscaler AI surfaces — AWS Bedrock, Google Vertex AI, and Azure OpenAI — support regional endpoints in 2026. Azure OpenAI has the strongest European posture thanks to the Microsoft EU Data Boundary commitment, which covers Azure OpenAI for most data categories. Bedrock and Vertex AI both offer explicit regional endpoints with in-region commitments but with varying model availability per region (EU and AP regions typically lag US by a quarter or two on new model variants). Direct model APIs (OpenAI, Anthropic, Mistral) typically default to global routing and require explicit add-ons or contractual commitments for regional residency — verify per-provider rather than assuming.