Business · New Release · 12 min read · Published June 21, 2026

Reported talks, not signed deals · Trainium nears parity with Blackwell · Nvidia still ~80% of the market

Amazon May Sell Its AI Chips: The Nvidia Challenge

Bloomberg reported on June 18, 2026 that Amazon is in early talks to sell its Trainium AI chips directly to data centers — a historic break from AWS-only distribution. The numbers behind the move are real; the talks are not yet deals, and Nvidia's CUDA moat is the barrier most coverage skips.

Digital Applied Team
Senior strategists · Published June 21, 2026
Sources: Bloomberg, TechCrunch, AWS
  • Nvidia accelerator share: ~80% of AI accelerators (range 75–87%)
  • Jassy standalone chip estimate: ~$50B hypothetical run rate (shareholder letter)
  • Trainium3 UltraServer: 362 MXFP8 PFLOPs per rack (≈ Blackwell ~360)
  • Trainium chips deployed: 1.4M across all generations (as of Mar 2026)

Amazon's custom AI chips may soon be for sale outside AWS. Bloomberg reported on June 18, 2026 that Amazon is in early talks to sell its Trainium AI accelerators directly to third-party data centers — a strategic break from a decade of AWS-exclusive distribution. The reporting is corroborated by TechCrunch, Benzinga, and Yahoo Finance, but the talks are explicitly preliminary; no deal has been signed.

The reason this is plausible now, and was not at earlier chip generations, is technical. Amazon's third-generation Trainium3 reaches rack-scale performance roughly on par with Nvidia's Blackwell NVL72 — the first time Amazon silicon has closed that gap at the rack level, according to third-party analysis. What stands between Amazon and a real merchant chip business is not raw performance. It is Nvidia's CUDA software ecosystem and a manufacturing pipeline that prioritizes Nvidia.

This analysis covers exactly what was reported and what was not, how large Amazon's silicon business already is, where Trainium3 genuinely matches Blackwell and where it does not, why CUDA is the real barrier, and what the whole episode signals about a broader hyperscaler-to-merchant shift in custom silicon. Every figure below is attributed; estimates are labeled as estimates.

Key takeaways
  1. These are reported talks, not signed deals. Bloomberg first reported on June 18, 2026 that Amazon is exploring direct Trainium sales to data centers. AWS AI chief Peter DeSantis confirmed the talks but named no customers. No external chip-sale agreement has been reported as closed.
  2. The chip business is already large internally. Amazon's custom-silicon division reportedly crossed a ~$20B annual run rate in Q1 2026, but that is internal transfer pricing to AWS, not merchant revenue. CEO Andy Jassy estimated a standalone chip business could run at ~$50B annually.
  3. Trainium3 reaches rack-scale parity with Blackwell. Trainium3's 144-chip UltraServer delivers about 362 MXFP8 PFLOPs at rack scale, essentially tied with Nvidia's Blackwell NVL72 (~360 PFLOPs). One analyst estimate puts Trainium3's total cost of ownership roughly 50% lower at the rack level.
  4. CUDA, not silicon, is the real barrier. Nvidia controls roughly 80% of the AI accelerator market. Its CUDA platform has a 15-year head start, over 4 million registered developers, and deeply entrenched training pipelines. Amazon is open-sourcing parts of its Neuron stack to chip away at that lock-in.
  5. This is market expansion, not displacement. Amazon itself deployed over a million Nvidia GPUs in 2026, and both stocks rose on the news. Custom ASIC shipments are projected to grow 44.6% year-over-year versus 16.1% for merchant GPUs, with both rising in absolute terms.

01 What Was Reported
Early talks, not a signed merchant deal.

On June 18, 2026, Bloomberg first reported that Amazon is in early talks to sell its Trainium AI chips directly to third-party data centers. The original Bloomberg article sits behind a paywall, but the substance was corroborated the same day by TechCrunch, Benzinga, Yahoo Finance, and Seeking Alpha. The reporting is consistent: the discussions are described as early-stage and preliminary, and Amazon has not closed any external chip-sale agreement as of the reporting date.

Peter DeSantis — Amazon's senior vice president overseeing AI and semiconductor operations, and AWS's AI chief — confirmed the talks in a Bloomberg interview conducted in Paris. He declined to name any specific potential customers. That is the entire confirmed core of the story: an executive acknowledging exploratory discussions, not an announced product line or a customer roster.

The move was foreshadowed. In his annual shareholder letter published April 9, 2026, CEO Andy Jassy wrote that demand for Amazon's chips was high enough that the company could plausibly sell racks to third parties in the future. The June reporting is the first concrete signal that the company is actively exploring how.

"We view AI infrastructure as rapidly evolving. And we're constantly looking at ways to get to more customers."
Peter DeSantis, Amazon SVP of AI and semiconductor operations, Bloomberg interview (Paris), June 2026

Read this before the numbers
Frame everything that follows as reported or estimated. The Bloomberg report is corroborated but the original is paywalled; the talks are preliminary; and several headline figures — Amazon's silicon run rate, the Trainium-vs-Blackwell cost comparison, and Nvidia's data-center revenue — are vendor-stated or third-party analyst figures, not audited disclosures. We label each one where it appears.

02 The Scale Already Built
A chip business hiding inside AWS.

Amazon's custom-silicon division — Trainium AI accelerators, Graviton CPUs, and Nitro networking chips — reportedly crossed a $20 billion annual revenue run rate in Q1 2026, growing at triple-digit rates year over year. The critical caveat: that figure represents internal transfer pricing to AWS, not merchant revenue from outside customers. Amazon has not separately broken out silicon revenue in its SEC filings, so treat the $20 billion as a reported internal metric rather than an audited line item.

The larger number that has driven headlines is Jassy's own estimate. In the April 9 shareholder letter, he framed the chip business as a standalone entity and put its hypothetical annual run rate at roughly $50 billion if it sold this year's production to AWS and third parties. That is a CEO's construct — a valuation of internally transfer-priced output at assumed merchant market rates — not reported external revenue. For scale, a $50 billion run rate would approximate Intel's annual revenue.

"There's so much demand for our chips that it's quite possible we'll sell racks of them to third parties in the future."
Andy Jassy, Amazon CEO, annual shareholder letter, April 9, 2026

The demand signal behind that confidence is unusually concrete. Jassy's letter notes that Trainium2 is completely sold out, that Trainium3 reached customers in early 2026 with nearly all supply reserved, and that Trainium4 already has substantial pre-orders despite being roughly 18 months from wide release. Benzinga similarly reported Trainium3 as largely sold out, with a fourth generation anticipated next year. Selling externally only becomes credible once internal demand is satisfied — and the sold-out posture is part of why this remains talks, not a product launch.

Two anchor customers explain much of that demand. Anthropic committed more than $100 billion over ten years to AWS technologies and up to 5 gigawatts of new Trainium capacity spanning Trainium2 through Trainium4, with Amazon investing an additional $25 billion in Anthropic ($5 billion immediately, $20 billion milestone-tied), announced April 20, 2026. OpenAI committed approximately 2 gigawatts of Trainium capacity through AWS. Project Rainier, described as one of the world's largest AI compute clusters, went live in late 2025 on 500,000 Trainium2 chips dedicated to Anthropic.

  • Internal run rate, custom silicon (Q1 2026, reported): ~$20B. Trainium, Graviton, and Nitro reportedly crossed a ~$20B annual run rate in Q1 2026. This is internal transfer pricing to AWS, not merchant revenue, and it is not separately broken out in SEC filings.
  • Standalone estimate, Jassy's hypothetical (Apr 2026 letter): ~$50B. CEO Andy Jassy estimated the chip business could run at ~$50B annually if it sold this year's output to AWS and third parties. This is a construct, not reported external revenue, and is roughly Intel's annual scale.
  • Deployed fleet, Trainium chips, all generations (Mar 2026): 1.4M. Over 1.4M Trainium chips are deployed across AWS, with more than 1M Trainium2 chips running Anthropic's Claude in production as of March 2026.

03 The Technical Inflection
Why Trainium3 reaching rack-scale parity changes the calculus.

Most coverage reports the talks without explaining why they are plausible now. The answer is rack-scale parity. According to third-party analysis from Oplexa, Trainium3's UltraServer — 144 chips per rack — delivers approximately 362 MXFP8 PFLOPs at rack scale, essentially tied with Nvidia's Blackwell NVL72 at around 360 PFLOPs. The same analysis estimates Trainium3's total cost of ownership at roughly 50% lower than Blackwell at the rack level. Treat that TCO figure as an analyst estimate — it comes from Oplexa, not from AWS or Nvidia.

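The nuance matters because the comparison flips at the chip level. At the individual-chip level, the same analysis puts Trainium3 at about 2.52 PFLOPs against Nvidia's B200 at roughly 20 PFLOPs. The B200 figure does not appear to share a precision basis with the rack numbers, though: the 72 GPUs in an NVL72 at 20 PFLOPs each would total 1,440 PFLOPs, four times the ~360 quoted for the rack, so the 20 PFLOPs figure is likely quoted at a lower precision or with sparsity. On the rack's own basis the NVL72 implies about 5 PFLOPs per GPU, which puts the like-for-like per-chip gap closer to 2x than 8x. Either way, Amazon does not win per-chip; it wins per-rack, by stacking 144 chips densely and pricing the whole rack lower. That is the technical inflection point that makes external sales credible for the first time, and it is a rack-level story, not a silicon-supremacy story.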
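As a cross-check on the figures in this section, the per-chip numbers can be multiplied out to rack scale. The following is a minimal Python sketch, not part of the Oplexa analysis. It uses only the figures quoted above plus one assumption that the text does not state: that an NVL72 rack holds 72 GPUs, as its product name indicates. The variable names are illustrative.

```python
# Figures quoted in the text (Oplexa third-party analysis).
TRN3_CHIP_PFLOPS = 2.52       # Trainium3, per chip
TRN3_CHIPS_PER_RACK = 144     # chips in one Trainium3 UltraServer
NVL72_RACK_PFLOPS = 360.0     # Blackwell NVL72, rack scale (approximate)
B200_CHIP_PFLOPS = 20.0       # B200, per chip, as quoted

# Assumption (not stated in the text): an NVL72 rack holds 72 GPUs.
NVL72_GPUS_PER_RACK = 72


def rack_pflops(per_chip: float, chips: int) -> float:
    """Aggregate throughput if every chip contributes its quoted figure."""
    return per_chip * chips


# Trainium3: per-chip figure multiplied out to the rack.
trn3_rack = rack_pflops(TRN3_CHIP_PFLOPS, TRN3_CHIPS_PER_RACK)

# B200: what the rack would total if 20 PFLOPs were on the rack's basis.
b200_rack_if_same_basis = rack_pflops(B200_CHIP_PFLOPS, NVL72_GPUS_PER_RACK)

# Per-GPU figure implied by the quoted NVL72 rack number.
implied_gpu_pflops = NVL72_RACK_PFLOPS / NVL72_GPUS_PER_RACK

print(f"Trainium3 rack from per-chip figure: {trn3_rack:.1f} PFLOPs")
print(f"NVL72 rack if B200 were 20 PFLOPs on the same basis: "
      f"{b200_rack_if_same_basis:.0f} PFLOPs")
print(f"Per-GPU figure implied by the ~360 PFLOPs rack: "
      f"{implied_gpu_pflops:.1f} PFLOPs")
print(f"Like-for-like per-chip ratio: "
      f"{implied_gpu_pflops / TRN3_CHIP_PFLOPS:.2f}x")
```

The Trainium3 per-chip figure reproduces the rack number, while the B200 per-chip figure does not, which suggests the two per-chip numbers are rated differently. The per-rack conclusion is unaffected.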

Trainium3 vs Blackwell: rack-scale parity, per-chip gap

Source: Oplexa third-party analysis (rack-scale PFLOPs). Per-chip and per-rack PFLOPs are not directly comparable scales.
  • Trainium3 UltraServer (rack, 144 chips): 362 MXFP8 PFLOPs at rack scale
  • Nvidia Blackwell NVL72 (rack): ~360 PFLOPs at rack scale (third-party estimate)
  • Nvidia B200 (single chip): 20 PFLOPs per chip (Nvidia wins per-chip)
  • Trainium3 (single chip): 2.52 PFLOPs per chip (Amazon wins per-rack, not per-chip)

The Trainium3 specifications behind the rack number are vendor-stated by AWS: eight NeuronCore compute engines per chip, 144 GB of HBM3e memory, 4.9 TB/s of memory bandwidth (1.7x Trainium2), and up to 2x the MXFP8 compute throughput of Trainium2, with built-in hardware support for Mixture-of-Experts routing. At the rack level, the 144-chip UltraServer aggregates to 20.7 TB of HBM3e, 706 TB/s of memory bandwidth, and 28.8 Tbps of scale-out bandwidth via Elastic Fabric Adapter. On the prior generation, AWS states that Trainium2 EC2 Trn2 instances deliver 30–40% better price-performance than GPU-based P5e and P5en instances.
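The rack-level memory figures follow directly from the per-chip specifications, which is a quick way to confirm the vendor numbers are internally consistent. This is a minimal sketch using only the figures quoted above. The implied Trainium2 bandwidth and the per-chip share of scale-out bandwidth are derived here for illustration and are not stated in the text.

```python
# Per-chip Trainium3 figures as quoted in the text (vendor-stated by AWS).
CHIPS_PER_ULTRASERVER = 144
HBM_PER_CHIP_GB = 144
MEM_BW_PER_CHIP_TB_S = 4.9
TRN2_BW_RATIO = 1.7           # Trainium3 bandwidth is quoted as 1.7x Trainium2
RACK_SCALE_OUT_TBPS = 28.8    # quoted rack-level EFA scale-out bandwidth

# Rack aggregates: per-chip figure times chip count (decimal TB).
rack_hbm_tb = CHIPS_PER_ULTRASERVER * HBM_PER_CHIP_GB / 1000
rack_mem_bw_tb_s = CHIPS_PER_ULTRASERVER * MEM_BW_PER_CHIP_TB_S

# Derived values: implied by the quoted figures, not stated in the text.
implied_trn2_bw_tb_s = MEM_BW_PER_CHIP_TB_S / TRN2_BW_RATIO
implied_scale_out_per_chip_gbps = (
    RACK_SCALE_OUT_TBPS / CHIPS_PER_ULTRASERVER * 1000
)

print(f"Rack HBM3e: {rack_hbm_tb:.1f} TB")
print(f"Rack memory bandwidth: {rack_mem_bw_tb_s:.0f} TB/s")
print(f"Implied Trainium2 memory bandwidth: {implied_trn2_bw_tb_s:.2f} TB/s")
print(f"Implied scale-out share per chip: "
      f"{implied_scale_out_per_chip_gbps:.0f} Gbps")
```

Both rack aggregates match the quoted 20.7 TB and 706 TB/s after rounding, so the rack figures are straight multiples of the per-chip specifications.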

The honest read on parity
Rack-scale parity is real and new; per-chip supremacy is not claimed. Amazon's advantage is density and price at the rack, not a faster single accelerator. The ~50% TCO advantage is one analyst’s estimate, not an audited figure — useful as a directional signal, not a procurement spec.

04 The Generational Arc
A decade of compounding from a $350M acquisition.

External sales are viable now and were not at Trainium1 or Trainium2 for a reason that only becomes visible across generations. Amazon acquired Annapurna Labs in January 2015 for roughly $350 million. The first Trainium was announced at re:Invent 2020 and shipped on a 7-nanometer node; Trainium2 was announced in late 2023 on 5 nm and reached general availability in December 2024; Trainium3 was announced at re:Invent 2025 and shipped in early 2026 on TSMC's 3 nm process. Each generation compounded performance, memory, and, crucially, manufacturing maturity.

The table below puts that progression in one view. Where a figure is vendor-stated by AWS it is marked; the Trainium4 row uses only announced framing — anticipated and roughly 18 months from wide release — because no specifications or pricing for the fourth generation exist publicly, and inventing them would be exactly the kind of fabrication this analysis avoids.

Trainium generation comparison: launch date, process node, memory per chip, and availability status across Trainium1 through the announced Trainium4.

Generation | Launch | Process node | Memory / chip | Availability

Shipping generations (vendor-stated specs)
Trainium1 | re:Invent 2020 · shipped 2021–22 | 7 nm (TSMC) | Not detailed here | Superseded
Trainium2 | Late 2023 | 5 nm (TSMC) | Not detailed here | Sold out (per Jassy)
Trainium3 | Announced re:Invent 2025 · shipping early 2026 | 3 nm (TSMC) | 144 GB HBM3e | Largely sold out

Announced only (no public specs or pricing)
Trainium4 | Anticipated next year | Not announced | Not announced | Substantial pre-orders · ~18 months out

The arc explains the timing better than any single spec. Trainium3 is the first generation to combine a leading-edge 3 nm node, HBM3e capacity competitive with Nvidia's top parts, and a rack architecture dense enough to reach Blackwell-class aggregate throughput. That convergence — node maturity plus memory plus rack density — is what turns "we could sell racks someday" into an active exploration in 2026.

05 The Real Barrier
The barrier is CUDA, not silicon.

Even with rack-scale parity, the hardest problem Amazon faces is software. Nvidia holds roughly 80% of the AI accelerator market — estimates range from 75% to 87% depending on scope and source. The durability of that share rests less on the GPUs themselves than on CUDA, Nvidia's software platform, which has a 15-year head start, more than 4 million registered developers, and over 40,000 organizations running CUDA-accelerated applications.

The lock-in is not a single line of code that a buyer can swap. It accumulates in kernel fusions, mixed-precision tuning, distributed training paths optimized around NCCL, and CUDA-native CI/CD pipelines. Any data center buying external Trainium would have to retrain engineers and revalidate production pipelines — a multi-year migration cost that constrains who can realistically switch, no matter how attractive the rack-level economics look on paper.

Amazon's counter is its Neuron software stack. The Neuron SDK already supports PyTorch native eager mode, FSDP, TorchTitan, vLLM, and HuggingFace Transformers with minimal code changes, and AWS plans to open-source the Neuron Kernel Interface compiler and communication libraries — an explicit play to undercut CUDA's software moat. Open-sourcing lowers the switching friction, but it does not erase 15 years of accumulated tooling, talent, and production validation overnight.

  • CUDA ecosystem (Nvidia's 15-year head start): Over 4 million registered developers and 40,000+ organizations. Lock-in lives in kernel fusions, NCCL-optimized training paths, and CUDA-native CI/CD, not in code that swaps cleanly. Hardest moat to cross.
  • Neuron SDK (Amazon's open-source counter): Supports PyTorch eager mode, FSDP, vLLM, and HuggingFace with minimal changes. AWS plans to open-source the Neuron Kernel Interface compiler, an explicit move to lower CUDA switching friction. Lowers, doesn't erase, friction.
  • Migration cost (who can realistically switch): Retraining engineers and revalidating pipelines is a multi-year cost. Realistic early buyers are sovereign clouds and national operators, not commodity data centers competing head-on with hyperscalers. Sovereign and national first.
  • Market structure (supplement, not replacement): Amazon deployed 1M+ Nvidia GPUs in 2026 and both stocks rose on the news. The realistic outcome is a second credible supplier at the rack level, not Nvidia's displacement. Two suppliers, not one.

The unstated customer profile
DeSantis pointed to European nations pursuing technology-independence strategies. The realistic early buyer for external Trainium is the sovereign cloud — national operators and governments motivated by independence and willing to absorb migration cost — not commodity data centers going head-to-head with hyperscalers on price alone.

06 The Supply Constraint
The TSMC bottleneck most coverage skips.

There is a second constraint that gets even less attention than CUDA: manufacturing. Trainium3 is fabricated by TSMC on a 3-nanometer node — the same leading-edge capacity that Nvidia depends on. TSMC currently prioritizes Nvidia as its largest customer. That ordering creates a hard ceiling on how quickly Amazon can scale external chip sales, because every Trainium wafer competes for the same scarce 3 nm allocation that Nvidia is buying at the front of the line.

This is the central limiting factor, and it is not price, not software, and not demand. Amazon's internal demand is already outrunning supply — Trainium2 sold out, Trainium3 largely sold out, Trainium4 pre-ordered. Selling externally means diverting constrained supply away from AWS's own marquee customers, including Anthropic and OpenAI. Until TSMC capacity loosens or Amazon secures a larger allocation, a true merchant chip business stays supply-gated regardless of how compelling the rack economics are.

"There's so much underconsumption in AI. I'm not worried about it."
Peter DeSantis, on whether external Trainium sales would cannibalize AWS cloud revenue, TechCrunch, June 18, 2026

DeSantis frames the demand side as essentially unbounded, which is why the cannibalization worry barely registers internally. But optimism about demand does not solve the supply equation. The gating question for external Trainium is not whether buyers exist — it is whether TSMC's 3 nm line can produce enough wafers for Amazon to serve outside customers without starving its own anchor commitments first.

07 The Structural Shift
From hyperscaler-only to merchant custom silicon.

Amazon's move is not a one-off; it fits a broader structural shift. Google Cloud has similarly begun selling its custom TPUs to select customers for installation in their own data centers, and Google projects 4.3 million TPU shipments in 2026. The pattern is the same across hyperscalers: custom silicon that began as internal-only infrastructure is edging toward external, merchant availability.

The growth math underlines why. Custom ASIC shipments are projected to grow 44.6% year over year in 2026 — nearly triple the 16.1% growth projected for merchant GPUs. Custom silicon could capture 10–20% of AI training and inference markets by the end of 2026, up from under 5%. Critically, both categories are growing in absolute terms. This is not custom ASICs taking share from GPUs in a shrinking pie; it is the whole accelerator market expanding, with custom silicon expanding faster off a smaller base.
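The point that both categories grow while the mix shifts can be made concrete with a few lines of arithmetic. The sketch below applies the two projected growth rates to a purely hypothetical starting split of 30 custom ASIC units to 70 merchant GPU units. That split is an assumption chosen for illustration and does not come from any cited source. It also tracks unit shipments, which is a different measure from the 10–20% market-share projection above.

```python
# Projected 2026 year-over-year shipment growth, as quoted in the text.
ASIC_GROWTH = 0.446
GPU_GROWTH = 0.161


def project(asic_base: float, gpu_base: float) -> dict:
    """Apply one year of growth to each category and report the unit mix."""
    asic_next = asic_base * (1 + ASIC_GROWTH)
    gpu_next = gpu_base * (1 + GPU_GROWTH)
    return {
        "asic_next": asic_next,
        "gpu_next": gpu_next,
        "asic_share_before": asic_base / (asic_base + gpu_base),
        "asic_share_after": asic_next / (asic_next + gpu_next),
    }


# Hypothetical base split (illustrative only, not from the article's sources).
result = project(asic_base=30.0, gpu_base=70.0)

# How much faster custom ASIC shipments grow than merchant GPU shipments.
growth_ratio = ASIC_GROWTH / GPU_GROWTH

print(f"Custom ASIC units: 30.0 -> {result['asic_next']:.2f}")
print(f"Merchant GPU units: 70.0 -> {result['gpu_next']:.2f}")
print(f"Custom ASIC unit share: {result['asic_share_before']:.1%} -> "
      f"{result['asic_share_after']:.1%}")
print(f"Growth-rate ratio: {growth_ratio:.2f}x")
```

Whatever base split is chosen, both unit counts rise and the custom ASIC share moves up, which is the expanding-market pattern the paragraph describes.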

  • Amazon: Trainium, select-external (reported June 18, 2026). 3 nm TSMC · Neuron SDK. In early talks to sell to third-party data centers. Anchor customers Anthropic and OpenAI; 1.4M+ chips deployed. Open-sourcing parts of the Neuron stack to chip at CUDA lock-in.
  • Google: TPU, select-external (selling externally). Custom ASIC · own data centers. Already selling custom TPUs to select customers for installation in their own data centers. Google projects 4.3 million TPU shipments in 2026, the clearest existing precedent for the merchant move.
  • Market: custom ASIC vs merchant GPU (2026 projections). Both growing in absolute terms. Custom ASIC shipments projected +44.6% YoY in 2026 vs +16.1% for merchant GPUs. Custom silicon could reach 10–20% of training/inference markets by end-2026, up from under 5%.

For context on the scale of what Amazon is challenging, it is worth reading this alongside Nvidia's trillion-dollar order pipeline and Nvidia's Vera Rubin architecture, which together define the moving target. The Anthropic and OpenAI commitments also sit squarely inside the broader AI infrastructure investment picture, and the custom-ASIC growth projections track the wider AI infrastructure spending forecasts.

08 What It Means
What this signals for businesses and engineering teams.

For most organizations, the immediate practical impact is small — you cannot buy external Trainium today, and the talks may not produce a product line for some time. The signal worth acting on is directional: the AI accelerator market is moving from a single dominant supplier toward a genuine two-or-three-supplier structure at the rack level, and that has real implications for cost, procurement, and architecture over the next 18 months.

Our own forward read: external Trainium will land first with sovereign and national cloud operators, not commodity data centers, and the binding constraint will be TSMC 3 nm allocation rather than price or software. If Amazon secures more leading-edge capacity and its Neuron open-sourcing gains traction, a credible second rack-level supplier emerges by late 2026 or 2027 — enough to soften Nvidia's pricing power at the margin, not enough to break the CUDA ecosystem for general workloads. The teams that benefit are those that build hardware-agnostic pipelines now, before they need to.

Accelerator market: custom ASIC vs merchant GPU growth, 2026

Source: Tom's Hardware ASIC projections; Silicon Analysts share estimate (range 75–87%)
  • Custom ASIC shipments, projected YoY growth, 2026: +44.6% (faster off a smaller base)
  • Merchant GPU shipments, projected YoY growth, 2026: +16.1% (larger absolute base)
  • Nvidia accelerator share: ~80%, range 75–87% depending on scope (still dominant)
  • Custom silicon market share, potential by end-2026: 10–20%, up from <5% (rising fast)

On the demand side, watch Nvidia's own roadmap as the counter-move — the local-inference and latest Nvidia silicon push is part of how Nvidia defends share against custom challengers. For organizations weighing how any of this changes their AI cost base, our AI and digital transformation engagements start with exactly this kind of vendor-and-architecture analysis, and our analytics work turns it into a measurable cost model rather than a headline reaction.

09 Conclusion
A credible challenge, gated by software and supply.

The shape of the Trainium challenge, June 2026

Amazon has the rack, the demand, and the cash — what it lacks is the software ecosystem and the wafers.

The reported talks are a genuine inflection, not hype. For the first time, Amazon silicon reaches rack-scale parity with Nvidia's best, internal demand is large enough to make a merchant business plausible, and a CEO has put a ~$50 billion figure on the standalone opportunity. That combination did not exist at earlier Trainium generations.

But the honest read is that this is a challenge, not a coronation. Nvidia still controls roughly 80% of the market, and the barriers Amazon must cross are the two least discussed in the headlines: CUDA's 15-year software moat and TSMC's 3 nm capacity, which Nvidia sits at the front of. Neither is solved by faster racks or lower prices. Both take years.

The clearest signal is the one underneath the Amazon-versus-Nvidia framing: this is market expansion, not displacement. Amazon deployed over a million Nvidia GPUs in 2026, both stocks rose on the news, and custom silicon is growing faster than GPUs while both grow in absolute terms. The right question is not "who wins" but "how many credible suppliers exist at the rack level in 18 months" — and the answer, for the first time, looks like more than one.

FAQ · Amazon Trainium vs Nvidia

The questions we get every week.

Not yet. Bloomberg first reported on June 18, 2026 that Amazon is in early talks to sell its Trainium AI chips directly to third-party data centers, and AWS AI chief Peter DeSantis confirmed those talks in a Bloomberg interview in Paris. But the discussions are explicitly described as early-stage and preliminary, no external chip-sale agreement has been reported as closed, and DeSantis declined to name any potential customers. The Bloomberg original is behind a paywall, though the substance was corroborated the same day by TechCrunch, Benzinga, and Yahoo Finance. Treat this as reported exploration, not a launched product line.