Mistral OCR 4 is a document-AI model, released June 23, 2026, that returns a structured representation of any enterprise document — bounding boxes, block types, and confidence scores — rather than the flat wall of text earlier OCR generations produced. For teams building document-automation pipelines, that shift is the headline: extraction stops being a parsing problem and becomes a structured-data problem.

It is Mistral’s fourth OCR generation in roughly 15 months, and it lands in a crowded field — Google Document AI, Amazon Textract, Azure Document Intelligence, ABBYY, and a wave of open-weight models. What sets OCR 4 apart is less any single benchmark number and more the combination of aggressive pricing ($4 per 1,000 pages, $2 in batch), a single-container self-hosting option, and a structured output that removes integration layers teams used to build by hand.

This guide covers what actually shipped, why the structured output changes automation economics, an honest read of the benchmark claims (several of which are vendor-stated), a recomputed cost comparison against the hyperscalers, and how to put OCR 4 to work without overpromising on the numbers.

Key takeaways

01. Structured representation, not just text. OCR 4 returns bounding boxes, block types (title, table, equation, signature), and page- and word-level confidence scores as first-class model outputs: the raw material for traceable, auto-approving pipelines.
02. Pricing is the durable advantage. $4 per 1,000 pages standard, $2 in batch, $5 for schema-driven Document AI. At batch pricing it undercuts Azure Document Intelligence's custom tier by up to 15x (about 7.5x at the standard rate).
03. Read the benchmarks with care. The 85.20 OlmOCRBench, 93.07 OmniDocBench, and ~72% win-rate figures are vendor-stated. On the public OlmOCRBench leaderboard (last updated May 21, 2026) OCR 4 would rank roughly third, not first.
04. Self-hosted, but not open-source. A single-container deployment keeps sensitive documents inside your own jurisdiction, which is useful as the EU AI Act's high-risk obligations approach on August 2, 2026. Self-hosting a commercial model is not the same as open weights.
05. Mistral is buying the document layer. Targeting €1B revenue in 2026 (up from ~€200M) and reportedly in early talks to raise ~€3B at roughly €20B, Mistral is pricing OCR 4 to win the enterprise ingestion layer for RAG and search.

01 · What Shipped: A fourth OCR generation that returns structure, not text.

Mistral describes the leap plainly. Where earlier generations focused on converting a page into clean text and tables, OCR 4 returns a structured representation of the document. In practice that means every extracted element arrives with its location on the page, a type label, and a confidence score — three things downstream automation historically had to reconstruct or guess at.

The model accepts PDF, DOC, PPT, and OpenDocument files directly, so the full spread of back-office documents flows in without a pre-conversion step. Mistral reports coverage across 170 languages in 10 language groups, with gains on rare and low-resource languages where competing systems tend to degrade — a meaningful detail for multinational document pipelines.

Release · 4th OCR generation (June 23, 2026): Mistral's fourth OCR model in roughly 15 months, released June 23, 2026. Available through the Mistral API and Studio, Amazon SageMaker, and Microsoft Foundry at launch, with Snowflake Parse Document integration announced as coming soon.

Inputs · PDF, DOC, PPT, ODF (no pre-conversion): Accepts PDF, DOC, PPT, and OpenDocument files directly, covering the full spread of back-office documents with no separate pre-conversion step before extraction.

Languages · 170 across 10 language groups (low-resource gains): Reported coverage across 170 languages, with gains on rare and low-resource languages where competing OCR systems tend to lose accuracy first.

Where to get it

OCR 4 is available at launch via the Mistral API (Mistral Studio), Amazon SageMaker, and Microsoft Foundry, with Snowflake Parse Document integration announced as coming soon. The pricing page lists OCR 4 as a Premier model and markets it as “the world’s best document extraction and understanding model” — a vendor claim, not an independent verdict.

02 · Structured Output: Why bounding boxes change the pipeline.

Bounding boxes were OCR 4’s most-requested feature, and the reason is structural. Without location data, a downstream RAG or compliance pipeline cannot trace an extracted fact back to the page it came from — the traceability gap that makes audit-ready extraction genuinely hard. With coordinates attached to every element, an extracted number can point back to the exact cell it was read from.

Block classification does similar work one layer up. OCR 4 assigns every element a type — title, table, equation, signature, and others — as a first-class model output rather than a separate post-processing stage. That removes an integration layer enterprise teams used to build in-house just to tell a heading from a table. Confidence scores complete the set: they operate at both page and word level, which is what makes confidence-gated automation possible.

Location · Bounding boxes, coordinates per element (source traceability): The most-requested OCR 4 feature. Every extracted element carries its position on the page, so a fact can be traced back to its source region, which is the prerequisite for audit-ready RAG and compliance workflows.

Type · Block classification, title, table, equation, signature (removes a layer): Each element is typed as a first-class model output, not a bolted-on post-processing step. That removes an integration layer teams previously built by hand to distinguish headings, tables, and signatures.

Certainty · Confidence scores, page level and word level (confidence-gated): Inline confidence at both page and word granularity. Auto-approve high-confidence regions and route only low-confidence ones to a human reviewer, so nobody has to read every page.

Mistral, in its own words

“Mistral OCR 4 extracts and structures content from a wide range of documents. Where previous generations focused on converting a page into clean text and tables, OCR 4 returns a structured representation of the document.” That single shift — from a text blob to a typed, located, scored object — is what lets a team build a pipeline that approves itself most of the time.
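To make "typed, located, scored" concrete, here is a minimal Python sketch of what one extracted element could look like once it reaches your own code. The `Block` class, its field names, the coordinate units, and the 0-to-1 confidence scale are all assumptions made for illustration; they are not Mistral's documented response schema, so map them to the real payload before relying on them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Block:
    """One extracted element (hypothetical shape, not the vendor schema)."""
    page: int
    block_type: str                          # e.g. "title", "table", "equation", "signature"
    bbox: Tuple[float, float, float, float]  # (x0, y0, x1, y1); units are an assumption
    text: str
    confidence: float                        # assumed 0.0 to 1.0 scale


def cite(block: Block) -> str:
    """Build a human-readable pointer back to the source region."""
    x0, y0, x1, y1 = block.bbox
    return (f"page {block.page}, {block.block_type} at "
            f"({x0:.0f}, {y0:.0f})-({x1:.0f}, {y1:.0f})")


def blocks_of_type(blocks: List[Block], block_type: str) -> List[Block]:
    """Select elements by type so tables, signatures, and text can get different rules."""
    return [b for b in blocks if b.block_type == block_type]


# Made-up sample data standing in for a parsed two-page document.
sample = [
    Block(1, "title", (72, 60, 540, 90), "Invoice 1042", 0.99),
    Block(1, "table", (72, 200, 540, 420), "Total: 1,250.00", 0.93),
    Block(2, "signature", (300, 650, 520, 700), "", 0.71),
]

for block in blocks_of_type(sample, "table"):
    print(cite(block), "->", block.text)
```

Keeping the citation string next to every extracted value is the cheap part of traceability: a reviewer or a RAG answer can point at a page region instead of a whole file.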

There is a second, related product worth separating clearly. OCR 4 is the extraction model. Document AI is a Studio product, priced at $5 per 1,000 pages, that wraps OCR 4 with a second-pass mistral-small-2603 call to reshape the output into custom JSON schemas. If your automation needs fields in a fixed shape rather than a generic structured document, Document AI is the mode to evaluate — and its dependence on the smaller model is one reason the Mistral Small model family matters to the document stack.

03 · The Cost Case: The number that actually moves a decision.

Strip away the benchmarks and the durable advantage is price. OCR 4 is $4 per 1,000 pages on the standard API and $2 per 1,000 in batch mode — a 50% discount for non-interactive workloads. Schema-driven Document AI is $5 per 1,000. The table below annualizes published list pricing across three volumes; every Mistral, Azure, and Google cell is the per-1,000-page rate multiplied out by hand.

Annualized document-AI cost by provider and tier at three yearly page volumes, derived from published per-1,000-page list pricing as of June 25, 2026. Mistral, Azure, and Google figures are list prices multiplied by volume. The Baidu Unlimited-OCR row carries no per-page fee; its self-hosted cost is an infrastructure estimate (GPU compute plus operations), not a quoted rate. Pricing changes frequently — verify on each vendor’s pricing page before budgeting.
Offering	Per 1,000 pages	10K pages / yr	100K pages / yr	1M pages / yr
Mistral OCR 4
OCR 4 — Batch API	$2.00	$20	$200	$2,000
OCR 4 — Standard API	$4.00	$40	$400	$4,000
Document AI (schema JSON)	$5.00	$50	$500	$5,000
Hyperscaler document AI
Azure Doc Intelligence — Read	$1.50	$15	$150	$1,500
Azure Doc Intelligence — Custom	$30.00	$300	$3,000	$30,000
Google Form Parser	~$30.00	~$300	~$3,000	~$30,000
Open-weight, self-hosted
Baidu Unlimited-OCR	No per-page fee*	Infra only*	Infra only*	Infra only*

The 100K-page row is the one to sit with. A firm processing 100,000 pages a year pays $200 in OCR 4 batch mode versus $3,000 on Azure Document Intelligence’s custom extraction tier, a 15x gap that holds at every volume because both prices scale linearly with page count. At the standard $4 API rate the gap is 7.5x. Azure’s Read tier is cheaper still at $1.50 per 1,000 pages, below even OCR 4’s $2 batch rate, but it returns text without the bounding boxes and block types that make OCR 4’s output automation-ready.
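The arithmetic behind the table is simple enough to rerun yourself when list prices change. The sketch below uses the per-1,000-page rates quoted in this article; the function and dictionary names are illustrative only.

```python
# List prices per 1,000 pages as quoted in this article (verify before budgeting).
RATES_PER_1K = {
    "OCR 4 batch": 2.00,
    "OCR 4 standard": 4.00,
    "Document AI": 5.00,
    "Azure Read": 1.50,
    "Azure Custom": 30.00,
}


def annual_cost(pages_per_year: int, rate_per_1k: float) -> float:
    """List-price cost for a year of pages at a flat per-1,000-page rate."""
    return pages_per_year / 1000 * rate_per_1k


def price_ratio(expensive: str, cheap: str) -> float:
    """How many times more one offering costs per page than another."""
    return RATES_PER_1K[expensive] / RATES_PER_1K[cheap]


for volume in (10_000, 100_000, 1_000_000):
    row = {name: annual_cost(volume, rate) for name, rate in RATES_PER_1K.items()}
    print(f"{volume:>9,} pages/yr:", ", ".join(f"{n} ${c:,.0f}" for n, c in row.items()))

print(price_ratio("Azure Custom", "OCR 4 batch"))     # 15.0
print(price_ratio("Azure Custom", "OCR 4 standard"))  # 7.5
```

Because every rate is flat, the ratios are independent of volume; volume discounts or committed-spend agreements would break that assumption.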

A caveat on the bottom row. Baidu’s open-weight Unlimited-OCR has no per-page license fee, but “free” is not zero: you pay for GPU compute, deployment, and operations. A precise per-page figure depends on your hardware, utilization, and throughput, so the table marks those cells as an infrastructure estimate rather than a quoted rate. The honest comparison is a managed per-page price against an amortized infrastructure cost you have to model for your own load.
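One way to model that comparison is a break-even calculation: at what yearly volume does a fixed infrastructure bill equal the managed per-page price? The sketch below is a simplification that treats self-hosting as a single fixed annual cost. The `PLACEHOLDER_ANNUAL_INFRA` figure is an invented placeholder, not a quote or an estimate for any real deployment; substitute your own GPU, deployment, and operations numbers.

```python
def self_hosted_cost_per_1k(annual_infra_cost: float, pages_per_year: int) -> float:
    """Amortized cost per 1,000 pages when infrastructure is a fixed yearly bill."""
    if pages_per_year <= 0:
        raise ValueError("pages_per_year must be positive")
    return annual_infra_cost * 1000 / pages_per_year


def break_even_pages(annual_infra_cost: float, managed_rate_per_1k: float) -> float:
    """Yearly page volume at which self-hosting matches the managed per-page price."""
    if managed_rate_per_1k <= 0:
        raise ValueError("managed_rate_per_1k must be positive")
    return annual_infra_cost * 1000 / managed_rate_per_1k


# PLACEHOLDER: invented figure for illustration only. Replace with your own model.
PLACEHOLDER_ANNUAL_INFRA = 6000.0
OCR4_BATCH_RATE = 2.00  # $ per 1,000 pages, from the table above

print(self_hosted_cost_per_1k(PLACEHOLDER_ANNUAL_INFRA, 1_000_000))
print(break_even_pages(PLACEHOLDER_ANNUAL_INFRA, OCR4_BATCH_RATE))
```

Below the break-even volume the managed price wins on cost alone; above it, self-hosting can pay off, provided the fixed-cost assumption still holds once you add hardware for the extra throughput.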

04 · Benchmarks: The scores, read honestly.

Mistral reports a top-line OlmOCRBench score of 85.20 and calls it the “top overall score.” That claim deserves a caveat. The public OlmOCRBench leaderboard — last updated May 21, 2026, before OCR 4’s release — places Infinity-Parser2-Pro at 87.6 and Chandra-2 at 85.9 above it, and VentureBeat independently notes OCR 4 would rank roughly third on the current public board. OCR 4’s 85.20 is a vendor-submitted figure that does not yet appear on the independently reproduced leaderboard.

OlmOCRBench · vendor-stated OCR 4 vs the independent leaderboard
Source: Mistral (OCR 4, vendor-stated); OlmOCRBench public leaderboard via CodeSOTA, last updated May 21, 2026
Model	Score	Status
Infinity-Parser2-Pro	87.6	Independently reproduced · public leaderboard #1
Chandra-2	85.9	Independently reproduced · public leaderboard
Mistral OCR 4	85.20	Vendor-submitted · not yet on the public board
Dots.mocr	83.9	Independently reproduced · public leaderboard

The rest of the benchmark story is similarly vendor-framed, and worth keeping in that frame. OlmOCRBench itself, built by the Allen Institute for AI, runs 7,010 unit tests across 1,403 PDFs in seven categories, with per-score uncertainty of roughly a point either way — so small gaps between models are inside the noise. The figures below are Mistral’s own; treat them as directional.
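A quick way to apply the "inside the noise" point is to check whether two scores' uncertainty bands overlap. The sketch below uses a plain interval-overlap heuristic with the roughly one-point uncertainty mentioned above. It is not a formal significance test, and the function name and default are assumptions for illustration.

```python
# Scores quoted in this article; OCR 4's figure is vendor-stated.
SCORES = {
    "Infinity-Parser2-Pro": 87.6,
    "Chandra-2": 85.9,
    "Mistral OCR 4": 85.20,
    "Dots.mocr": 83.9,
}


def bands_overlap(score_a: float, score_b: float, uncertainty: float = 1.0) -> bool:
    """True when [a - u, a + u] and [b - u, b + u] overlap (heuristic, not a test)."""
    return abs(score_a - score_b) <= 2 * uncertainty


reference = SCORES["Mistral OCR 4"]
for name, score in SCORES.items():
    if name != "Mistral OCR 4":
        verdict = "within noise" if bands_overlap(score, reference) else "clear gap"
        print(f"{name}: {score - reference:+.1f} vs OCR 4 -> {verdict}")
```

Under this heuristic, the gaps to Chandra-2 and Dots.mocr are too small to rank confidently, while the gap to Infinity-Parser2-Pro is large enough to read as a real difference.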

OmniDocBench · 93.07 (vendor-stated, not third-party verified): Mistral's reported OmniDocBench score. For context, PaddleOCR-VL-1.6 self-reports 96.33, though that result has not been independently reproduced on the public leaderboard either.

Human eval · Average win rate ~72% (vendor-commissioned): Average head-to-head win rate against leading competitors across 600+ real-world documents in 12+ languages, judged by independent annotators Mistral commissioned. The annotators were independent; the study was vendor-run.

Internal eval · Crawl Multilingual .98 (internal, unverifiable): Mistral's internal multilingual evaluation, reported as leading across all eight language groups. This is an internal benchmark and cannot be independently verified.

The transparency worth noting

Mistral did something unusual: it published the scoring artifacts it found in OlmOCRBench — ground-truth errors, equivalent LaTeX notation scored as mismatches, column-reading assumptions, header/footer attribution issues — and wrote that it therefore treats the aggregate score as “directional rather than definitive.” Read that as a signal of engineering credibility, and as a template for how to weigh any vendor’s OCR benchmark, not just this one.

What does the 85.20 actually measure? The table below maps OlmOCRBench’s seven categories to the back-office failure each one predicts — the difference between an abstract leaderboard and a decision about whether to trust extraction on your own documents.

The seven OlmOCRBench test categories (from the Allen Institute olmOCR-bench dataset, 7,010 tests across 1,403 PDFs) mapped to what each measures and the back-office failure mode it predicts. The failure-mode column is Digital Applied editorial interpretation, not part of the benchmark.
Category	What it checks	What failure looks like in your workflow
arXiv Math	Equation fidelity in LaTeX	A formula in a research report or actuarial model transcribes wrong, silently changing a result.
Tables	Row/column structure recovery	An invoice or financial statement loses its grid, so totals land in the wrong fields downstream.
Headers / Footers	Boilerplate vs body separation	Page numbers, disclaimers, or letterhead bleed into the extracted body text of a contract.
Multi-Column	Reading order across columns	A two-column policy or terms document gets interleaved, scrambling clauses out of sequence.
Old Scans	Degraded-image legibility	An archived deed, claim file, or shipping record returns garbled text the pipeline cannot trust.
Old Scans Math	Formulas on degraded scans	Both failure modes stack — a faint historical engineering or finance document loses its numbers.
Long / Tiny Text	Dense or small-font passages	Fine-print footnotes and dense appendices — exactly where the binding terms hide — drop out.

05 · Deployment: API, marketplace, or your own jurisdiction.

OCR 4 ships three ways to consume it, and the third is the strategic one. Beyond the managed API and the cloud marketplaces, Mistral supports a single-container self-hosted deployment — letting a regulated enterprise process sensitive documents entirely inside its own infrastructure, with no routing to an external U.S.-jurisdiction cloud API. For organizations weighing the broader tradeoffs, our self-hosted deployment decision guide covers the infrastructure side in depth.

Managed · Mistral API / Studio (fastest to ship): Lowest-friction path. Per-page billing, batch discount for non-interactive jobs, and Document AI schema extraction in the same place. Best when data residency is not a hard constraint and you want to move fast.

Cloud · SageMaker and Microsoft Foundry (inside your cloud): Run OCR 4 inside an existing AWS or Azure footprint, billed through accounts you already govern. Snowflake Parse Document integration is announced as coming soon. Best when you have committed cloud spend and procurement rails.

Sovereign · Single-container self-host (documents stay home): Documents never leave your infrastructure, which answers data-residency and sovereignty requirements as the EU AI Act's high-risk obligations approach on August 2, 2026. Self-hosting a commercial model is not the same as open weights.

"At some point, you need to be able to turn it off or turn it on, and you don't want to leave it to another country." (Arthur Mensch, CEO, Mistral AI, on AI sovereignty, London Tech Week, June 2025)

One precise distinction

Self-hosted does not mean open-source. OCR 4 is a commercial API product with an enterprise self-hosting option; the weights are not openly licensed the way a true open-weight model’s are. If open weights are a hard requirement, Baidu’s Unlimited-OCR is the model to look at — not OCR 4 in a container.

06 · The Field: Who OCR 4 is actually competing with.

OCR 4 arrives against established hyperscaler document AI and a fast wave of open-weight models. The most direct counterpoint shipped one day earlier: Baidu’s Unlimited-OCR, a 3-billion-parameter, MIT-licensed model that parses entire PDFs in a single forward pass and gathered roughly 1,800 GitHub stars in its first 24 hours. It is free and self-hosted — and it has no managed API and no enterprise SLA, which is exactly the gap OCR 4’s paid tier fills. Mistral’s own open-weight model lineage is part of why a self-hosting story is even credible from this vendor.

Hyperscaler · Azure Document Intelligence (incumbent on Azure): The incumbent comparison. The Read tier at $1.50 per 1,000 pages undercuts even OCR 4's $2 batch price, but it returns text without bounding boxes or block types; the custom extraction tier runs $30 per 1,000, which is the 15x gap at the top of this post.

Hyperscaler · Google & Amazon (ecosystem default): Google's Form Parser runs ~$30 per 1,000 pages; Amazon Textract is the established AWS option. Deep ecosystem integration, but priced well above OCR 4's per-page economics for structured extraction.

Open weight · Baidu Unlimited-OCR (free, you run it): Free, MIT-licensed, self-hosted, 3B params, single-pass PDF parsing. No managed API and no enterprise SLA, so you own the deployment and the operations. The DIY counterpoint to a paid managed model.

Established IDP · ABBYY and Textract incumbents (entrenched workflows): Mature intelligent-document-processing suites with template libraries and human-in-the-loop tooling built in. Strong on entrenched workflows; the question is per-page cost and how much of the new structured output you'd be re-buying.

07 · Market & Momentum: A land-grab for the ingestion layer.

The pricing makes more sense as strategy than as a margin play. The global intelligent document processing market was about $2.30B in 2024 and is projected to reach $12.35B by 2030 at a 33.1% CAGR, with BFSI the largest segment. OCR 4 feeds directly into Mistral’s Search Toolkit as the ingestion layer for RAG and enterprise search — so winning document extraction is really about owning the front door to every downstream AI workflow.

The financial backdrop fits that ambition. Mistral is targeting €1 billion in revenue for 2026, up from roughly €200 million in 2025, and is reportedly in early discussions to raise about €3 billion at a valuation near €20 billion — nearly double its €11.7 billion Series C from September 2025. No round has been announced as of late June. Pricing OCR 4 to undercut the hyperscalers by an order of magnitude is how you buy share in a market growing at 33% a year, and pairs naturally with Mistral’s broader enterprise AI stack.

Market 2030 · IDP market forecast $12.35B (33.1% CAGR): Up from $2.30B in 2024 at a 33.1% CAGR, per Grand View Research. North America holds 32%+ of the 2024 market and BFSI is the largest end-use segment: the buyers OCR 4's sovereignty story targets.

Revenue · 2026 revenue target €1B (5x vs 2025): Up from roughly €200M in 2025, a 5x target (Le Monde, via VentureBeat). OCR 4 and its document-AI pipeline are central to that trajectory, which is why the per-page price is set to win share.

Valuation · Reported funding talks ~€20B (early discussions): Mistral is reportedly in early discussions to raise ~€3B at roughly €20B, nearly double its €11.7B Series C (Sep 2025), when ASML took an 11% stake. No deal has been announced as of June 25.

08 · Putting It to Work: From a structured output to a pipeline that approves itself.

The practical payoff of confidence scores is a pipeline that does not ask a human to read every page. Set a threshold; auto-approve regions above it; route the rest to review. Bounding boxes give the reviewer the exact spot to look, and block types let you apply different rules to tables, signatures, and free text. That is the difference between an OCR tool and a document-automation system — and it is where the customer testimonials, hedged appropriately, point.
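The gating logic described above fits in a few lines. This is a minimal sketch under stated assumptions: the `Region` fields, the 0-to-1 confidence scale, and every threshold value are hypothetical choices for illustration, not vendor recommendations. Tune the thresholds against a labeled sample of your own documents.

```python
from typing import Dict, List, NamedTuple


class Region(NamedTuple):
    """A scored, typed region (hypothetical shape for illustration)."""
    page: int
    block_type: str
    text: str
    confidence: float  # assumed 0.0 to 1.0 scale


# Hypothetical thresholds: stricter rules for block types where errors cost more.
DEFAULT_THRESHOLD = 0.95
THRESHOLDS_BY_TYPE = {"signature": 0.99, "table": 0.97}


def route(regions: List[Region]) -> Dict[str, List[Region]]:
    """Split regions into auto-approved and human-review queues by confidence."""
    queues: Dict[str, List[Region]] = {"auto_approve": [], "human_review": []}
    for region in regions:
        threshold = THRESHOLDS_BY_TYPE.get(region.block_type, DEFAULT_THRESHOLD)
        key = "auto_approve" if region.confidence >= threshold else "human_review"
        queues[key].append(region)
    return queues


# Made-up sample regions.
regions = [
    Region(1, "title", "Invoice 1042", 0.99),
    Region(1, "table", "Total: 1,250.00", 0.96),
    Region(2, "signature", "", 0.995),
    Region(2, "text", "Net 30 terms", 0.81),
]

queues = route(regions)
for name, items in queues.items():
    print(name, [(r.page, r.block_type) for r in items])
```

Per-type thresholds are the design choice worth noticing: a table cell at 0.96 goes to review here even though a title at the same score would pass, which is how block types turn into different rules for tables, signatures, and free text.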

Two of those testimonials are worth quoting with that hedge in mind. Rogo, a financial-AI firm, reported reaching equivalent accuracy at roughly 8x lower cost and 17x lower latency versus leading agentic document parsers; Anaqua, an IP-management firm, reported OCR 4 is roughly 4x faster per page than its incumbent. Both are customer statements on single, undisclosed datasets — directional evidence, not reproduced benchmarks. The right move is to run OCR 4 against your own documents before you commit a forecast to it.

If you are mapping document automation onto a real back-office process — invoice capture, claims intake, contract review, CRM data entry — the structured output is the input, but the value is in the workflow around it. That scoping work is exactly what our AI digital transformation engagements start with: a confidence-gating design and an honest per-page cost model before any vendor commitment.

High volume, structured · Invoice & form capture (OCR 4 batch + gating): Batch mode at $2 per 1,000 pages plus confidence-gated review is the strongest fit. The 100K-page row in the cost table is this use case: $200 a year in extraction versus thousands on a custom hyperscaler tier.

Fixed-schema output · Structured JSON pipelines (Document AI mode): When you need fields in a fixed shape, Document AI at $5 per 1,000 reshapes OCR output into custom schemas via a second-pass model. It is worth the premium only if the schema step earns it.

Regulated data · Sovereignty-bound workloads (self-host, then measure): Single-container self-hosting keeps documents in your jurisdiction as the EU AI Act's high-risk obligations approach. Model the amortized infrastructure cost against the managed per-page price for your real volume.

Open-weight requirement · No commercial dependency (open-weight alternative): If open weights are non-negotiable, evaluate Baidu Unlimited-OCR instead and budget for the GPU and ops you'll own. OCR 4 in a container is sovereign, but it is still a commercial license.

09 · Conclusion: A pricing move dressed as a model release.

The shape of document AI, June 2026

Document automation just became a cost question, not a capability one.

Mistral OCR 4 is best understood less as a benchmark winner than as a pricing and packaging move. The structured output — bounding boxes, block types, confidence scores — is genuinely useful and removes integration work teams used to do by hand. The per-page price, at $2 in batch, is what makes large-scale digitization economically boring in the best sense: a 100,000-page archive for $200 stops being a budget conversation.

Keep the benchmark claims in their box. The 85.20 OlmOCRBench, 93.07 OmniDocBench, and ~72% win-rate figures are vendor-stated, and on the independent public leaderboard OCR 4 would sit around third, not first. Mistral’s own “directional rather than definitive” framing is the right posture to borrow — and the reason to run the model against your own documents rather than trust the headline.

The forward read is straightforward. With Baidu shipping a free open-weight parser the day before and Mistral pricing a managed model at an order-of-magnitude discount to the hyperscalers, the margin in raw extraction is compressing fast. The value is migrating to the workflow above it — confidence-gating, schema design, and the sovereignty wrapper — and to whoever owns the ingestion layer feeding every downstream RAG and search pipeline. That, not a leaderboard row, is what OCR 4 is really competing for.

Mistral OCR 4: Document AI for Business Automation
