Grok 4.3 on Amazon Bedrock went generally available on June 15, 2026 — the first time an xAI model has been offered through the platform, making xAI the third major independent lab there alongside Anthropic and OpenAI. The launch reads like a straightforward catalog addition. It is not.

At an on-demand rate of $1.25 input and $2.50 output per million tokens, Grok 4.3 is the cheapest US-lab frontier reasoning model on Bedrock by a wide margin. But three things sit just under that headline that most coverage skips: a Mantle endpoint that breaks standard Bedrock SDK code, a context-window pricing structure that doubles costs above 200,000 tokens, and a vendor — xAI — in the middle of a SpaceX restructuring with most of its founding team gone.

This guide covers exactly what landed on June 15, how the pricing actually works once you account for long context, the practical integration gotcha that no other coverage surfaces clearly, an honest benchmark read by enterprise domain, and the governance questions a regulated-sector team has to answer before deploying. Every number below is sourced and dated; where a figure is vendor-stated or carries a caveat, we say so.

Key takeaways

01
Grok 4.3 is live on Bedrock as of June 15, 2026. xAI becomes the third independent lab on the platform, behind Anthropic and OpenAI. The Bedrock launch arrived roughly six weeks after grok-4.3 became the default on the direct xAI API on April 30, 2026.
02
It is the cheapest US-lab frontier reasoning model on Bedrock. On-demand pricing is $1.25 input and $2.50 output per million tokens, with cached input at $0.20, undercutting Claude Sonnet 4.6 ($3 / $15) and the Bedrock GPT tier on both sides of the bill.
03
The 1M context window doubles in price above 200K tokens. The headline million-token window is real, but requests over 200,000 tokens are billed at the higher context tier. For long-document workloads the effective cost can land well above the sticker rate.
04
Grok on Bedrock uses Mantle, not bedrock-runtime. It runs on a new inference engine via the bedrock-mantle endpoint with an OpenAI-compatible path, not the Converse or InvokeModel APIs. Existing Bedrock SDK code will not work unchanged.
05
Strong benchmarks, real governance questions. Grok 4.3 posts competitive agentic and domain scores, but factual-accuracy gains came alongside a non-hallucination regression, and the vendor is mid-restructuring with 9 of 11 co-founders departed.

01 — What Launched: xAI joins Bedrock, six weeks after the API.

Grok 4.3 was not new on June 15. The model was first released on the direct xAI API on April 17, 2026, with the API default switched to grok-4.3 on April 30. The Bedrock general availability announcement came roughly six weeks after that switch, which matters: the model, its benchmark profile, and its pricing were all established before AWS added it to the catalog. The Bedrock story is about distribution and enterprise rails, not a new capability tier.

On Bedrock the model ID is xai.grok-4.3. It ships with a 1-million-token context window and a maximum output of 30,000 tokens. Input modalities are text and image; audio, speech, and video are not supported, and output is text only. Reasoning is always on by default and cannot be fully switched off — it is configurable through a reasoning effort parameter with four levels (none, low, medium, high), where none suppresses reasoning tokens from the output rather than stopping the internal process.

Grok 4.3 on Bedrock (reasoning-first, GA June 15, 2026): xai.grok-4.3 · 1M context · 30K max output. Text + image input, text output. Always-on, configurable reasoning effort (none / low / medium / high). Service tiers: Standard, Priority, and Flex; Reserved is not supported.

Regions (In-Region only at launch): us-west-2 · us-east-1 · us-east-2, that is, Oregon, N. Virginia, and Ohio. Geo Cross-Region and Global Cross-Region inference are not supported, so a multi-region failover plan must account for the limited footprint. Verify the current region list before you deploy.

Multi-cloud context

Grok's cloud rollout has been steady: Oracle Cloud Infrastructure (June 2025, its first third-party cloud), Microsoft Azure AI Foundry (September 2025), then Amazon Bedrock on June 15, 2026. A separate Databricks Agent Bricks integration followed three days later, on June 18. These are distinct announcements, not a single Bedrock event. (Source: Memeburn, June 2026, citing the xAI Bedrock announcement, which returned an access error on direct fetch; verify exact dates against the AWS listing.)

02 — Pricing: The cheapest US-lab reasoning model on Bedrock.

The pricing is the reason most teams will look at this launch at all. On-demand Bedrock rates are $1.25 per million input tokens and $2.50 per million output tokens, with cached input at $0.20 per million. This is consistent with the rate on the direct xAI API. Against the other US-lab frontier options on Bedrock, that is the lowest sticker price on both sides of the bill — though, as always with Bedrock, verify the current numbers on the AWS pricing page before you budget against them.

Relative to its own predecessor, Grok 4.20, this is a large cut. Independent benchmark coverage puts it at roughly a 40% reduction in input cost and a 60% reduction in output cost versus the prior generation. (Note the version numbering: xAI went from 4.1 to 4.20 — there is no Grok 4.2.)

Grok 4.3 vs Grok 4.20 · per-million-token pricing

Source: AWS Bedrock model card; reduction figures via Artificial Analysis

Model	Token type	Price per 1M tokens
Grok 4.20 (prior gen)	Output	~$6.00
Grok 4.3	Output	$2.50
Grok 4.20 (prior gen)	Input	~$2.00
Grok 4.3	Input	$1.25
Grok 4.3	Cached input	$0.20

One cost factor offsets part of the per-token saving: higher-reasoning Grok 4.3 reportedly emits roughly 44% more output tokens than Grok 4.20 on comparable tasks. At high request volumes, more output tokens at a lower per-token rate can still net out cheaper — but the saving is smaller than the headline cut suggests, and it depends on your workload mix. Run the math on your own traffic, not on the sticker price.
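As a rough sketch of that math, the Python below combines the output prices quoted above with the reported ~44% increase in output tokens. The Grok 4.20 price is the approximate chart figure, and the 1M-token baseline workload and function name are illustrative assumptions, not measured traffic.

```python
# Approximate per-million-token output prices quoted in this article (USD).
GROK_420_OUTPUT = 6.00       # prior generation, approximate
GROK_43_OUTPUT = 2.50        # Grok 4.3 on-demand rate
EXTRA_OUTPUT_TOKENS = 0.44   # reported ~44% more output tokens on comparable tasks


def output_cost(tokens: float, price_per_million: float) -> float:
    """Cost in USD for `tokens` output tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical baseline: a workload that emitted 1M output tokens on Grok 4.20.
baseline_tokens = 1_000_000
old_cost = output_cost(baseline_tokens, GROK_420_OUTPUT)
new_cost = output_cost(baseline_tokens * (1 + EXTRA_OUTPUT_TOKENS), GROK_43_OUTPUT)

sticker_saving = 1 - GROK_43_OUTPUT / GROK_420_OUTPUT
effective_saving = 1 - new_cost / old_cost

print(f"sticker output saving:   {sticker_saving:.0%}")
print(f"effective output saving: {effective_saving:.0%}")
```

Under these assumptions the output-side saving shrinks from roughly 58% on sticker price to about 40% once the extra tokens are counted. Swap in your own token counts before drawing conclusions.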

A note on the extra fees

On the direct xAI API, two cost lines sit outside the per-token rate: a small reported fee for requests blocked by safety filters before generation, and per-call fees for tool use such as web search, code execution, and file attachments. These apply on the xAI direct API — whether the same structure carries over to the Bedrock Mantle path is not something we can confirm from the AWS model card, so do not assume it in a Bedrock cost model until AWS documents it. Verify before you commit a budget.

03 — The Context Cliff: The 1M window doubles in price above 200K.

This is the single most important caveat in the whole launch, and it is buried. The 1-million-token context window is genuinely available, but pricing is not flat across it. Requests that exceed 200,000 total tokens are billed at a higher context tier, where the per-token rate doubles. For the long-document workloads that a million-token window is supposed to enable — contract sets, case-law corpora, full financial filings — the effective cost can be meaningfully above the sticker rate.

Treat the $1.25 / $2.50 numbers as the price for everything under 200K tokens. The moment a request crosses that line, model your costs at the higher tier. A long-document RAG pipeline that routinely assembles 400K–800K-token prompts is not running at the headline rate; it is running at the doubled one. The marketing claim and the invoice live in two different pricing regimes.
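A minimal cost sketch makes the cliff concrete. It assumes, following the description above, that once a request's total tokens exceed 200,000 the whole request (input and output) is billed at double the base rate. How AWS actually applies the tier, for example whether output tokens count toward the threshold, should be confirmed on the pricing page. The function name and example token counts are ours.

```python
SHORT_CONTEXT_LIMIT = 200_000    # tokens; above this the higher tier applies
INPUT_PRICE = 1.25               # USD per 1M input tokens, under the limit
OUTPUT_PRICE = 2.50              # USD per 1M output tokens, under the limit
LONG_CONTEXT_MULTIPLIER = 2.0    # "the per-token rate doubles"


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under the two-tier pricing.

    Assumption: if total tokens exceed the limit, the entire request is
    billed at the doubled rate, not just the tokens past the threshold.
    """
    total = input_tokens + output_tokens
    multiplier = LONG_CONTEXT_MULTIPLIER if total > SHORT_CONTEXT_LIMIT else 1.0
    return multiplier * (
        input_tokens / 1_000_000 * INPUT_PRICE
        + output_tokens / 1_000_000 * OUTPUT_PRICE
    )


short_prompt = request_cost(150_000, 4_000)   # stays under the cliff
long_prompt = request_cost(600_000, 4_000)    # crosses the cliff

print(f"150K-token prompt: ${short_prompt:.4f}")
print(f"600K-token prompt: ${long_prompt:.4f}")
```

In this sketch the 600K-token prompt is four times larger than the 150K one but costs about 7.7 times as much, which is the gap between the marketing rate and the invoice.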

Headline window: 1M-token context. The full million-token window is real and available on Bedrock, useful for whole-codebase analysis, multi-document reasoning, and long financial or legal corpora. Max output is 30K tokens.

Pricing cliff: the rate doubles at 200K. Requests above 200,000 total tokens are billed at the higher context tier. Long-document prompts that routinely exceed this run at roughly double the headline per-token cost, so model costs at 2x.

Output ceiling: 30K max output tokens. Output is capped at 30,000 tokens per request. Reasoning-heavy responses with always-on reasoning can consume meaningful output budget, so size jobs accordingly.

"Grok 4.3 is as smart as Sonnet 4.6 and 5x cheaper and faster." — Bindu Reddy, CEO of Abacus AI, on X, May 1, 2026 (market sentiment, not an independent benchmark)

04 — Integration Gotcha: Mantle, not bedrock-runtime.

Here is the practical surprise that breaks copy-paste integrations. Grok 4.3 on Bedrock does not run on the standard bedrock-runtime endpoint that Claude, Amazon Titan, and other Bedrock models use. It runs on Mantle, a new inference engine inside Bedrock built for price-performance, reached through the bedrock-mantle endpoint. It does not support the Converse API or InvokeModel.

Instead, Mantle is OpenAI-compatible: it exposes an openai/v1 path, so developers already using OpenAI SDKs can point them at the Mantle endpoint and migrate with relatively small code changes. The endpoint format is https://bedrock-mantle.{region}.api.aws/openai/v1. The catch for teams standardized on Bedrock's own SDK is real: existing Converse or InvokeModel code will not call Grok unchanged.

Two further migration details matter. The compatibility is on the openai/v1/responses path rather than the classic Chat Completions path, so describe it as OpenAI-compatible, not fully interchangeable. And the Bedrock defaults differ from the OpenAI API defaults: temperature defaults to 0.7 (not 1), top_p to 0.95 (not 1), and max_completion_tokens to 131,072. Set these explicitly when porting OpenAI SDK code, or your outputs will drift from what the same code produced against OpenAI.
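To make the porting details concrete, here is a small standard-library sketch that assembles the Mantle URL and a request body without sending anything. The endpoint format, model ID, and launch regions come from the text above. The body field names (`model`, `input`, `temperature`, `top_p`) follow the OpenAI Responses request shape and are an assumption about what Mantle accepts. Authentication is deliberately omitted, and the helper names are hypothetical.

```python
import json

# Launch regions listed earlier in this article; verify the current list.
LAUNCH_REGIONS = {"us-west-2", "us-east-1", "us-east-2"}


def mantle_base_url(region: str) -> str:
    """Build the OpenAI-compatible base URL for the bedrock-mantle endpoint."""
    if region not in LAUNCH_REGIONS:
        raise ValueError(f"{region} is not in the launch region list")
    return f"https://bedrock-mantle.{region}.api.aws/openai/v1"


def build_request(
    prompt: str,
    region: str = "us-east-1",
    temperature: float = 1.0,   # pinned to the OpenAI default, not Bedrock's 0.7
    top_p: float = 1.0,         # pinned to the OpenAI default, not Bedrock's 0.95
) -> tuple[str, str]:
    """Return (url, json_body) for a Responses-style call; nothing is sent."""
    body = {
        "model": "xai.grok-4.3",
        "input": prompt,
        # Set sampling explicitly so ported code does not silently pick up
        # the different Bedrock defaults.
        "temperature": temperature,
        "top_p": top_p,
    }
    return f"{mantle_base_url(region)}/responses", json.dumps(body)


url, body = build_request("Summarize this contract clause.", region="us-west-2")
print(url)
print(body)
```

In practice most teams would pass the same base URL to an OpenAI SDK client rather than build requests by hand. The point of the sketch is that the URL, the path, and the sampling parameters all need to be set deliberately.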

Migration blocker check

If your platform is already standardized on the Bedrock Converse API or InvokeModel, adding Grok 4.3 is not a config change — it is a separate integration path through the Mantle endpoint with an OpenAI-compatible client. Scope that work before you promise Grok in a roadmap. This is the gotcha most launch coverage glosses over.

If you want a reference point for what good agentic tooling around Grok looks like outside Bedrock, xAI's own Grok CLI and parallel agent tooling is a useful companion read on how the model is meant to be driven for tool-calling and multi-agent work.

05 — Benchmarks: Strong on agents, an honest accuracy caveat.

Grok 4.3 benchmarks well on the workloads AWS is targeting — customer support, legal research, financial document analysis — and it makes a clear generational jump on agentic tasks. But the accuracy picture is mixed in a way that matters for regulated sectors, and it is worth stating plainly rather than cherry-picking.

On the Artificial Analysis Intelligence Index, the headline depends on reasoning effort: 53 at high effort and roughly 38 at low effort. Never quote a single bare number, because the gap between the two settings is the difference between competitive and middling. At high effort, the score of 53 is above that of the previous Grok 4.20 release; cite the effort level whenever you quote it.

The accuracy trade-off you should not skip

Per Artificial Analysis, Grok 4.3 gained about 8 points on the AA-Omniscience factual-accuracy benchmark over Grok 4.20 — but lost about 8 points on non-hallucination rate over the same comparison. More right answers, but also more confident wrong ones. For finance, healthcare, and legal workflows where a fabricated citation is a liability, that regression is the number to weigh against the marketing language. Do not describe Grok 4.3 as best-in-class on hallucination — the trend on that specific metric is negative.

Grok 4.3 by enterprise domain · selected benchmarks

Source: Artificial Analysis; Vals AI scores via VentureBeat (vendor-adjacent)

Benchmark	Domain	Score	Note
Tau2-Bench Telecom	Customer-support tool calling	98%	+8 pts vs Grok 4.20
Vals AI CaseLaw v2	Legal research	79.3%	#1 ranked (vendor-adjacent source)
IFBench	Instruction following	81%	agentic
GDPval-AA (agentic ELO)	Agentic tasks	1,500	+321 ELO vs Grok 4.20's 1,179; trails GPT-5.5 xhigh
AA Intelligence Index	Knowledge / reasoning	53 at high effort	~38 at low effort; effort-dependent

The agentic story is the genuinely strong one. The GDPval-AA agentic ELO of 1,500 is a 321-point jump over Grok 4.20's 1,179, and it surpasses several named competitors on that benchmark — though it still trails GPT-5.5 (xhigh) by a wide margin, with an expected win rate well under a coin flip. So the right framing is not best-in-class; it is markedly better than its predecessor and competitive on a price-adjusted basis. For a full top-tier view of where it sits, our frontier model comparison of Claude Opus 4.8 and GPT-5.5 sets the ceiling Grok 4.3 is measured against.
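For a feel of what a 321-point gap means, the sketch below applies the conventional logistic Elo formula with a 400-point scale. That GDPval-AA uses this exact formula is our assumption, not something the source confirms, so treat the result as an order-of-magnitude illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Ratings quoted above: Grok 4.3 at 1,500 and Grok 4.20 at 1,179.
p_vs_predecessor = elo_expected_score(1500, 1179)
print(f"Expected win rate vs Grok 4.20: {p_vs_predecessor:.1%}")
```

Under that assumption Grok 4.3 would be expected to beat its predecessor in roughly 86% of head-to-head comparisons. The same function, fed a higher-rated opponent, yields a value below 50%, which is what "well under a coin flip" describes.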

"The energy drink of frontier models: it'll keep you up, but you won't enjoy the experience and you'll regret it in the morning." — Corey Quinn, cloud cost analyst, The Register, May 29, 2026

06 — Bedrock Lineup: Where Grok 4.3 sits in the Bedrock catalog.

The table below maps Grok 4.3 against the other frontier and efficient options a Bedrock team is likely to weigh, on the two axes that drive a model-selection decision: price and the endpoint you integrate against. The endpoint-type column is the one no published comparison includes, and it is the one that determines how much integration work a switch actually costs. Bedrock pricing moves often — verify every figure on the AWS pricing page before you commit.

Comparison of Grok 4.3 against other Bedrock frontier and efficient models across provider, input and output price per million tokens, context window, and Bedrock endpoint type.
Model	Provider	Input $/M	Output $/M	Context	Endpoint
Grok 4.3	xAI	$1.25	$2.50	1M	Mantle
Amazon Nova Pro	Amazon	$0.80	$3.20	300K	Runtime
DeepSeek V3.2	DeepSeek	$0.62	$1.85	128K	Runtime
Claude Sonnet 4.6	Anthropic	$3.00	$15.00	200K	Runtime
Claude Opus 4.7	Anthropic	$5.00	$25.00	200K	Runtime
GPT-5.4 (via Bedrock)	OpenAI	~$2.50	~$15.00	128K	Runtime

Read across one row and the positioning is clear. Grok 4.3 is the only frontier-class reasoning model in this set under $1.50 input, and the only one with a million-token window — but it is also the only one on a non-standard endpoint. Amazon Nova Pro and DeepSeek V3.2 are cheaper still on input, but they sit in a different tier on reasoning. The decision is rarely price alone; it is price weighed against integration cost and the accuracy trade-off above. With AWS now hosting all three independent labs, you can also read this alongside OpenAI's GPT-5.5 and its 1M-context options when you shortlist.
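To see how the table's prices translate to a single call, the sketch below ranks the six models by cost for one hypothetical request of 20,000 input and 2,000 output tokens. Prices are copied from the table (the GPT-5.4 figures are approximate), the request size is an illustrative assumption, and the calculation ignores caching, service tiers, and reasoning-token overhead.

```python
# (input USD per 1M tokens, output USD per 1M tokens) from the table above.
PRICES = {
    "Grok 4.3": (1.25, 2.50),
    "Amazon Nova Pro": (0.80, 3.20),
    "DeepSeek V3.2": (0.62, 1.85),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.7": (5.00, 25.00),
    "GPT-5.4 (via Bedrock)": (2.50, 15.00),  # approximate
}


def cost_per_request(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the table's sticker prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical request size, well under every model's context window.
INPUT_TOKENS, OUTPUT_TOKENS = 20_000, 2_000

ranked = sorted(PRICES, key=lambda m: cost_per_request(m, INPUT_TOKENS, OUTPUT_TOKENS))
for model in ranked:
    print(f"{model:<24} ${cost_per_request(model, INPUT_TOKENS, OUTPUT_TOKENS):.4f}")
```

For this request shape Grok 4.3 comes in third, behind DeepSeek V3.2 and Amazon Nova Pro and ahead of the GPT and Claude rows, which matches the positioning described above. Change the token mix to match your own traffic, since output-heavy workloads shift the ranking.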

07 — The Governance Gap: Compliance credentials are not the same as stability.

This is the analysis that separates a model-selection decision from a vendor-risk decision. Grok 4.3 has the certifications a regulated team looks for — xAI maintains SOC 2 Type II, HIPAA eligibility, and GDPR compliance for production workloads, and on Bedrock the infrastructure layer inherits AWS's security posture. On paper, a finance or healthcare team can deploy it today on rails it already trusts.

The certifications, though, describe the model and the platform — not the organization behind the model, which is in unusual flux. xAI merged into SpaceX in February 2026 in an all-stock transaction, and by May 2026 the plan was for xAI to cease to exist as a separate company, with Grok and X folded into a SpaceX AI division. Nine of the eleven original co-founders have departed (some accounts say ten; the most-cited figure is 9 of 11). And in June 2026 a former engineer filed a whistleblower retaliation lawsuit alleging he was fired for raising safety concerns about Grok — a claim naming both xAI and SpaceX, and unproven at the time of writing.

What this means for regulated buyers

The compliance checklist passes. The open question is organizational continuity: a vendor mid-acquisition, a departed founding team, a pending API migration to SpaceX infrastructure, and an active safety lawsuit. None of that is disqualifying on its own, but a long-term procurement that bets on stable API behavior and roadmap deserves more diligence here than the certifications alone would suggest. The analyst case is even sharper.

Some industry analysts argue the Bedrock listing is less about enterprise pull than about compute deals. The pattern they point to: AWS has tended to secure large commitments to its custom Trainium silicon around the labs it adds to Bedrock, and xAI trains Grok on a very large NVIDIA GPU cluster at its Memphis Colossus site — a natural migration target for Amazon's chips. On that reading, the launch is a chip-sales story wearing a model-availability headline. Independent signals are consistent with a slow enterprise start: of more than 400 documented federal AI deployments naming a vendor, only three involve xAI or Grok. Treat that as a caution flag on regulated-sector traction, not a verdict on the model.

"Bedrock becomes little more than a sales funnel with infuriatingly bad documentation." — Corey Quinn, cloud cost analyst, The Register, May 29, 2026

08 — Decision Matrix: Should your team deploy it?

The answer is workload-specific, not headline-specific. Grok 4.3 is a strong fit for some classes of work and a poor one for others, and the right move is to sort your pipelines into the buckets below before you touch the Mantle endpoint.

Cost-sensitive reasoning: high-volume agentic workloads. Verdict: strong candidate. At $1.25 / $2.50 with strong agentic and tool-calling scores, Grok 4.3 is the price-performance pick on Bedrock for support automation and tool-use pipelines, provided you account for the ~44% extra output tokens.

Long-document RAG: million-token workloads. Verdict: run the cliff math first. The 1M window is real, but cost doubles above 200K tokens. Model the doubled tier honestly; if your prompts routinely exceed 200K, the effective price may erase the headline advantage versus a 200K-window model.

Regulated verticals: finance, healthcare, legal. Verdict: pilot with guardrails. Certifications are in place, but the non-hallucination regression and vendor-in-flux risk warrant extra diligence. For high-liability outputs, keep a frontier accuracy model in the loop and pilot before committing.

Bedrock-standardized teams: Converse / InvokeModel shops. Verdict: scope the integration. Grok runs on Mantle, not bedrock-runtime, so it is a separate OpenAI-compatible integration path, not a config flip. Budget the engineering before promising it, and set the non-default temperature / top_p explicitly.

For most teams the pragmatic sequence is the same: shortlist Grok 4.3 on price, prototype against the Mantle endpoint on your own prompts, measure real token spend with the 200K cliff and the extra output tokens factored in, and run your highest-liability prompts through an accuracy check before you trust it in production. Deciding between Grok and closed frontier for specific pipelines is exactly the kind of comparative evaluation our AI and digital transformation engagements start with — and the kind of multi-vendor routing we help teams stand up so no single vendor's instability becomes a single point of failure.

09 — Conclusion: A genuine bargain with real fine print.

The shape of Grok on Bedrock, June 2026

The cheapest US-lab reasoning model on Bedrock — read the fine print before you budget.

Grok 4.3 on Amazon Bedrock is a real event, not just a catalog line. xAI is now the third independent lab on the platform, and at $1.25 / $2.50 per million tokens it is the cheapest US-lab frontier reasoning model there. For high-volume agentic and tool-use workloads, that price-performance profile is genuinely compelling — and the agentic benchmark jump over the prior generation is the strongest part of the story.

The fine print is where teams will win or lose money. The 1M window doubles in price above 200K tokens, so long-document workloads do not run at the headline rate. The model lives on the Mantle endpoint, so it is a separate integration for anyone standardized on Bedrock's own SDK. And the accuracy story is mixed: more correct answers, but a measured regression on hallucination that matters most in exactly the regulated verticals AWS is targeting.

The broader signal is that frontier capability is no longer the scarce thing — distribution, price discipline, and organizational stability are. Grok 4.3 brings the first two convincingly. The third, with a vendor mid-restructuring, is the open question. The right response is not a vendor decision off a headline; it is your own eval on the prompts you care about, with the cliff, the endpoint, and the accuracy trade-off all priced in.

Grok 4.3 Lands on Bedrock: xAI Goes Enterprise