Chinese AI Models Q2 2026: 10-Provider Landscape Report
Q2 2026 market-share report on Chinese AI providers — Xiaomi MiMo, Qwen, GLM, DeepSeek, Kimi, MiniMax, Baichuan, and Yi. Usage data, licensing, and enterprise adoption.
Key Takeaways
Chinese AI providers now serve over 45% of all OpenRouter traffic, up from less than 2% a year ago. Xiaomi alone holds nearly three times OpenAI's share. This is the Q2 2026 landscape report: who the ten meaningful providers are, what they ship, how they price it, and how to evaluate them for production workloads.
The shift is not a benchmark story. Chinese models do not yet lead the Artificial Analysis Intelligence Index — MiMo-V2-Pro ranks #10 despite being the #1 model by usage. The shift is a cost, availability, and developer-choice story. Free-preview access, 1M token context windows, and per-token prices three to ten times below US frontier models have moved the default backend for AI coding IDEs, agent platforms, and cost-sensitive production workloads east.
Data source: OpenRouter Rankings retrieved April 3, 2026. All market-share figures reflect weekly token volume across the OpenRouter API, which is the most public and auditable usage dataset for hosted AI models. For the full ranking breakdown see our April 2026 OpenRouter rankings analysis.
Q2 2026 Landscape at a Glance
Ten providers cover essentially all meaningful Chinese AI output in Q2 2026. The field has consolidated sharply since 2024, when dozens of labs published competing checkpoints. Today the volume flows through Xiaomi, Alibaba, Z.ai (Zhipu), DeepSeek, Moonshot AI, MiniMax, StepFun, ByteDance, Baidu, and Tencent, with Baichuan, Yi, Xunfei, and KwaiKAT operating as second-tier niche players.
| Rank | Provider | Weekly tokens | Share | Flagship model |
|---|---|---|---|---|
| 1 | Xiaomi | 4.21T | 21.1% | MiMo-V2-Pro |
| 2 | Alibaba (Qwen) | 2.77T | 13.9% | Qwen 3.6 Plus |
| 3 | MiniMax | 1.62T | 8.1% | MiniMax M2.7 |
| 4 | Z.ai (Zhipu) | 1.12T | 5.6% | GLM-5 / GLM-5 Turbo |
| 5 | DeepSeek | 1.11T | 5.6% | DeepSeek V3.2 |
| 6 | StepFun | 1.07T | 5.3% | Step 3.5 Flash |
| 7 | Moonshot AI | Sub-rank | Tracked | Kimi K2.5 |
| 8 | ByteDance | Sub-rank | Tracked | Seed 2.0 (Doubao) |
| 9 | Baidu | Domestic | Domestic | ERNIE 5.0 |
| 10 | Tencent | Domestic | Domestic | Hunyuan (internal) |
Share figures reflect OpenRouter weekly token volume at the provider level. Baidu, Tencent, and Moonshot concentrate usage on domestic Chinese surfaces and partner ecosystems rather than OpenRouter, so their global ranking understates their home-market presence. For Western buyers evaluating these providers, the OpenRouter data is still the most defensible apples-to-apples benchmark.
Mapping models to your stack? Model selection is rarely a single-benchmark decision. Explore our AI Digital Transformation service to translate this landscape into a production-ready architecture.
Xiaomi: The Phone Company Dominating AI Volume
Xiaomi is the story of this report. A consumer electronics company best known for smartphones and smart-home hardware holds 21.1% of OpenRouter weekly tokens, nearly three times OpenAI's 7.5%. Xiaomi's AI lab shipped three frontier checkpoints between December 2025 and March 2026 under the MiMo brand, and each variant carved out a clear slot in the pricing curve.
- MiMo-V2-Pro — 1.04M context, $1 input / $3 output per million tokens. The #1 model on OpenRouter at 4.79T weekly tokens and 25.5% of all coding traffic.
- MiMo-V2-Omni — 262K context, $0.40 input / $2 output per million tokens. Unified image, video, and audio architecture in a single checkpoint.
- MiMo-V2-Flash — 262K context, $0.09 input / $0.29 output per million tokens. Top open-source claim on general reasoning at this price point.
For deeper technical coverage, see our MiMo-V2-Pro trillion-parameter release guide and the MiMo-V2-Omni omnimodal release guide.
Why Xiaomi won volume
Three decisions drove the rankings. First, MiMo-V2-Pro shipped on OpenRouter with a free preview tier that AI coding IDEs adopted as a default backend within weeks. Second, the 1.04M context window matched Qwen 3.6 Plus and exceeded most US frontier context ceilings, making MiMo the obvious choice for whole-repo refactors. Third, the pricing gap versus Claude Opus 4.6 is roughly 5x at input and 8x at output, which matters when agent frameworks expand token consumption by an order of magnitude.
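That order-of-magnitude point is easy to make concrete. The sketch below uses the per-million-token list prices cited in this report; the 10x agent amplification factor and the 200K-input / 20K-output task shape are illustrative assumptions, not measured workloads:

```python
# Illustrative cost comparison at the list prices cited in this report.
# The 10x agent amplification factor is an assumption for illustration.
PRICES = {  # model -> (input $/1M tokens, output $/1M tokens)
    "mimo-v2-pro": (1.00, 3.00),
    "claude-opus-4.6": (5.00, 25.00),
}

def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single run at list prices."""
    p_in, p_out = PRICES[model]
    return input_tokens / 1e6 * p_in + output_tokens / 1e6 * p_out

def agent_cost(model: str, input_tokens: int, output_tokens: int,
               amplification: int = 10) -> float:
    """Agent frameworks replay context and chain calls, multiplying tokens."""
    return run_cost(model, input_tokens * amplification,
                    output_tokens * amplification)

# A 200K-in / 20K-out coding task, amplified 10x by an agent loop:
mimo = agent_cost("mimo-v2-pro", 200_000, 20_000)      # $2.60
opus = agent_cost("claude-opus-4.6", 200_000, 20_000)  # $15.00
```

At these list prices the amplified gap is roughly 5.8x per task, which is why free previews and sub-dollar input pricing moved default backends so quickly.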
Alibaba Qwen: The Scale Leader
Alibaba's Qwen family is the second-ranked provider at 13.9% share and 2.77T weekly tokens, but it is the broadest product line in Chinese AI. Between January 23 and April 2, 2026, Alibaba shipped six named Qwen releases covering coding, reasoning, ultra-cheap inference, omnimodal, and flagship tiers.
- Qwen 3.6 Plus (Apr 2, 2026) — Flagship. 1M context, 65K output, always-on chain-of-thought, native function calling. Free during preview. Currently #2 on OpenRouter with 1.64T weekly tokens.
- Qwen 3.5-Omni (Mar 30, 2026) — Native omnimodal. 256K context, 113 languages for speech recognition, Thinker-Talker architecture. Mostly closed-source.
- Qwen 3.5 Flash (Feb 24, 2026) — Ultra-cheap high context. 1M tokens, $0.065 input / $0.26 output per million. Top pick for cost-sensitive batch workloads.
- Qwen 3 Coder Next (Feb 3, 2026) — Coding-specific. 256K context, $0.12 input / $0.75 output per million. Purpose-built for IDE integrations.
- Qwen 3 Max Thinking (Jan 23, 2026) — Reasoning variant. 262K context, $0.78 input / $3.90 output per million. Positioned against GPT-5.4 Pro and Opus 4.6 on hard problems.
- Qwen 3.5 small series (Mar 2-3, 2026) — On-device. 0.8B to 9B parameters. The 9B variant beats several closed US models on GPQA Diamond.
For current flagship detail, see our Qwen 3.6 Plus 1M-context release guide.
Alibaba is also the provider most actively pushing Chinese AI into cross-border commerce and enterprise workflows through Alibaba Cloud International. For Western teams evaluating Chinese models under compliance constraints, Qwen's licensing terms and cloud footprint are typically the least painful starting point.
Zhipu GLM: Enterprise Enablement
Z.ai — the international brand for Zhipu AI — holds 5.6% OpenRouter share but commands an outsized share of Chinese domestic enterprise procurement. GLM-5 launched February 11, 2026 with 744B total parameters, 44B active per token, a Mixture-of-Experts architecture with 256 experts, and an MIT license. The model trains and serves end-to-end on Huawei Ascend silicon, making it the cleanest answer for Chinese state-owned and export-control-sensitive buyers.
- 744B total parameters, 44B active, 200K context window.
- 77.8% SWE-bench Verified, competitive with Claude Sonnet 4.6.
- $0.80 input / $2.56 output per million tokens on direct API, $1.20 / $4 on the GLM-5 Turbo faster variant.
- MIT licensed weights — among the most permissive in the Chinese frontier tier.
- GLM-5V-Turbo (April 1, 2026) extends the 744B base with multimodal vision and posts competitive agentic-browsing benchmark results.
For the full architectural and benchmark breakdown, see our Zhipu GLM-5 744B MoE release analysis.
Zhipu's positioning is the clearest example of the bifurcation in Chinese AI. While Xiaomi and Alibaba optimize for global developer volume, Zhipu optimizes for Chinese enterprise deployment, domestic-hardware independence, and permissive licensing. For Western self-hosters and researchers, GLM-5's MIT license is often the deciding factor over the Qwen and Kimi licensing frameworks.
DeepSeek: Open-Weight Economics
DeepSeek holds 5.6% OpenRouter share through DeepSeek V3.2, a 685B parameter MoE model released December 2025 with DeepSeek Sparse Attention, gold-medal performance on the 2025 IMO and IOI, and Thinking-in-Tool-Use behavior. The Speciale variant reportedly surpasses GPT-5 on specific reasoning benchmarks, though direct head-to-head evaluations remain hard to replicate.
For a detailed walkthrough of the V3.2 release, see our DeepSeek V3.2 and Speciale complete guide.
DeepSeek V4 status: As of April 2026, V4 is expected but not released. Public reporting points to ~1T total parameters, 1M context, and Huawei Ascend as the primary hardware stack, with projected pricing in the $0.10-$0.30 per million input range. Treat any V4 timeline as unconfirmed until DeepSeek publishes an official release.
DeepSeek's strategic value is open-weight economics. V3.2 is available for self-hosting, fine-tuning, and quantization with commercial license terms that many Western teams find acceptable. When the production decision comes down to "run the model inside my own cloud region or not," DeepSeek is usually on the shortlist alongside GLM-5 and Kimi K2.5.
Moonshot Kimi: Long-Context Agents
Moonshot AI's Kimi K2.5 launched January 27, 2026 with 1 trillion total parameters, 32B active per request, a 262K context window, and $0.38 input / $1.72 output per million tokens. The standout feature is Agent Swarm technology — the ability to coordinate up to 100 agents simultaneously inside a single inference loop. Moonshot claims K2.5 beats Claude Opus 4.5 on agentic benchmarks, a claim that holds up on several public harnesses.
Kimi K2.5 is also notable as the base model powering Cursor Composer 2, which scored 73.7% on SWE-bench Multilingual at launch. That makes K2.5 one of the few Chinese models actively embedded in a Western developer tool's production stack, rather than served only as a standalone API.
For the full agent swarm architecture breakdown, see our Kimi K2.5 agent swarm open-source guide.
MiniMax: Self-Evolving Agentic Workflows
MiniMax holds 8.1% OpenRouter share and is the provider most associated with agentic self-evolution. Its M-series shipped four numbered releases in under six months: M2, M2.1, M2.5 (February 12, 2026, 80.2% SWE-Bench Verified), and M2.7 (March 18, 2026, 56.22% SWE-Pro, 10B active parameters, roughly 50x cheaper than Claude Opus per comparable workload). MiniMax M2.7 is currently #4 on OpenRouter at 1.34T weekly tokens with +24% week-over-week growth.
For deeper coverage on the M2.7 release, see our MiniMax M2.7 agentic coding release guide.
MiniMax's positioning is the inverse of Xiaomi's. Rather than leading with raw context or headline pricing, MiniMax leans into self-evolving architectures — models that update internal representations across multi-step runs, letting agents refine strategy within a single task instead of waiting on human feedback cycles. For agentic workloads with well-defined reward signals, M2.7 is often the best price-to-capability option in the OpenRouter top ten.
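For context, the external version of this pattern is a refine-until-reward loop. The sketch below is a generic outer loop, not MiniMax's internal mechanism (M2.7 internalizes the refinement rather than relying on an orchestrator); `call_model` and `score` are caller-supplied placeholders:

```python
from typing import Callable, Tuple

def refine(task: str,
           call_model: Callable[[str, str], str],
           score: Callable[[str], float],
           max_steps: int = 5,
           target: float = 1.0) -> Tuple[str, float]:
    """Run a model repeatedly against a reward signal, feeding the score back.

    Generic outer-loop sketch: self-evolving models internalize this
    refinement, while conventional models need an external loop like this one.
    """
    best, best_score = "", float("-inf")
    feedback = ""
    for _ in range(max_steps):
        candidate = call_model(task, feedback)
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
        if best_score >= target:
            break  # reward signal satisfied
        feedback = f"Last attempt scored {s:.2f}. Improve it."
    return best, best_score
```

The loop only pays off when `score` is a genuine reward signal (tests passing, schema validation), which is exactly the "well-defined reward" caveat above.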
StepFun and Second-Tier Providers
Step 3.5 Flash from StepFun (February 2, 2026) is the surprise performer outside the top four: a 196B MoE with 11B active parameters per token, 262K context, and $0.10 input / $0.30 output per million tokens in the paid tier, free on OpenRouter preview. It sits at #3 on the OpenRouter free-model leaderboard at 1.38T weekly tokens. StepFun trained the model on NVIDIA Hopper rather than Huawei Ascend and serves inference at up to 350 tokens per second.
Below StepFun, a handful of second-tier providers serve specific niches:
- ByteDance Seed 2.0 (Doubao) — China's most-used consumer AI app with 155M weekly active users. Pro variant matches GPT-5.2 at ~10x lower cost. Seed 2.0 Lite and Mini extend the family into ultra-cheap tiers ($0.10-$0.25 per million input).
- Baidu ERNIE 5.0 — 2.4T-parameter omnimodal flagship, trained on Baidu Kunlun silicon. Integrated with Baidu's search engine for the dominant Chinese discovery surface.
- KwaiKAT KAT-Coder-Pro V2 — Coding-specific Kuaishou model released March 27, 2026. 256K context, $0.30 / $1.20 per million, competitive on Chinese-language coding.
- Baichuan and Yi — Active open-source and enterprise tiers in China, limited OpenRouter presence, positioned against Qwen and GLM for domestic buyers.
- Xunfei (iFlytek) — Spark series, strong on speech, education, and public-sector workflows. Limited global-market ranking.
- Tencent Hunyuan — Internal enterprise flagship, integrated across WeChat and Tencent Cloud. Minimal OpenRouter footprint by design.
Note that NVIDIA's Nemotron 3 Super 120B (released March 10-11, 2026, 60.47% SWE-Bench Verified, open source, 262K context) often appears in "Chinese AI" conversations because of its pricing profile, but it is an NVIDIA model trained and served outside China. Do not count it toward Chinese market share.
Pricing Comparison Matrix
The table below covers thirteen current flagship and volume models from eight of the ten providers, sorted by provider. All figures are OpenRouter list prices as of April 2026 and cover input, output, and context for the default API variant.
| Provider | Model | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| Xiaomi | MiMo V2 Pro | $1.00 | $3.00 | 1.04M |
| Xiaomi | MiMo V2 Flash | $0.09 | $0.29 | 262K |
| Alibaba | Qwen 3.6 Plus | Free (preview) | Free (preview) | 1M |
| Alibaba | Qwen 3.5 Flash | $0.065 | $0.26 | 1M |
| Alibaba | Qwen 3 Max Thinking | $0.78 | $3.90 | 262K |
| Z.ai (Zhipu) | GLM-5 | $0.80 | $2.56 | 200K |
| Z.ai (Zhipu) | GLM-5 Turbo | $1.20 | $4.00 | 203K |
| DeepSeek | DeepSeek V3.2 | Varies by host | Varies by host | Varies by host |
| Moonshot | Kimi K2.5 | $0.38 | $1.72 | 262K |
| MiniMax | MiniMax M2.7 | $0.30 | $1.20 | 205K |
| MiniMax | MiniMax M2.5 | $0.12 | $0.99 | 197K |
| StepFun | Step 3.5 Flash | $0.10 | $0.30 | 262K |
| ByteDance | Seed 2.0 Lite | $0.25 | $2.00 | 262K |
For reference, OpenAI GPT-5.4 sits at $2.50 / $15.00 with 1.05M context, Claude Sonnet 4.6 at $3.00 / $15.00 with 1M, and Claude Opus 4.6 at $5.00 / $25.00 with 1M. Against those anchors, the Chinese flagship tier undercuts US pricing by roughly 2.5-5x at input and 4-8x at output on comparable context lengths.
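Turning the matrix into a selection rule is straightforward: filter by context window, then rank by blended cost. The sketch below mirrors a subset of the table (free previews modeled at $0); the model keys are shorthand and `include_free` is a convenience flag of this sketch, not an OpenRouter feature:

```python
# Subset of the pricing matrix above: list $/1M tokens, context in tokens.
# Free-preview rows are modeled at $0; preview pricing is temporary.
MATRIX = {
    "mimo-v2-pro":    {"in": 1.00,  "out": 3.00, "ctx": 1_040_000},
    "qwen-3.6-plus":  {"in": 0.00,  "out": 0.00, "ctx": 1_000_000},
    "qwen-3.5-flash": {"in": 0.065, "out": 0.26, "ctx": 1_000_000},
    "glm-5":          {"in": 0.80,  "out": 2.56, "ctx": 200_000},
    "kimi-k2.5":      {"in": 0.38,  "out": 1.72, "ctx": 262_000},
    "minimax-m2.7":   {"in": 0.30,  "out": 1.20, "ctx": 205_000},
    "step-3.5-flash": {"in": 0.10,  "out": 0.30, "ctx": 262_000},
}

def cheapest(min_ctx: int, in_tok: int, out_tok: int,
             include_free: bool = True) -> str:
    """Cheapest model whose context window fits the job.

    Set include_free=False to ignore temporary free-preview pricing
    when planning steady-state costs.
    """
    fits = {m: p for m, p in MATRIX.items()
            if p["ctx"] >= min_ctx and (include_free or p["in"] + p["out"] > 0)}
    return min(fits, key=lambda m: in_tok / 1e6 * fits[m]["in"]
                                   + out_tok / 1e6 * fits[m]["out"])
```

Running it with `include_free=False` is the more honest planning mode, since preview pricing can end without notice.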
Capability Comparison Matrix
The capability matrix below scores flagship models across five workloads: coding, reasoning, tool use, multimodal inputs, and multilingual handling. Scores reflect the reference benchmarks cited in our provider breakdowns earlier in this report.
| Model | Coding | Reasoning | Tool use | Multimodal | Multilingual |
|---|---|---|---|---|---|
| MiMo V2 Pro (Xiaomi) | Strong (25.5% coding share) | Solid | #1 OpenRouter tool calls | Text only | Chinese + English |
| MiMo V2 Omni (Xiaomi) | Solid | Solid | Strong | Image + video + audio | Chinese + English |
| Qwen 3.6 Plus (Alibaba) | Strong (23.5% coding share) | Always-on CoT | Native function calling | Text + limited image | Broad |
| Qwen 3.5-Omni (Alibaba) | Solid | Solid | Strong | Full omnimodal | 113 languages speech |
| GLM-5 (Zhipu) | 77.8% SWE-Verified | Strong | Solid | GLM-5V-Turbo variant | Chinese + English |
| DeepSeek V3.2 | Strong | IMO/IOI gold | Thinking-in-Tool-Use | Text only | Chinese + English |
| Kimi K2.5 (Moonshot) | Cursor Composer 2 base | Strong | 100-agent swarm | Multimodal MoE | Chinese + English |
| MiniMax M2.7 | 56.22% SWE-Pro | Self-evolving | Strong | Text primary | Chinese + English |
| Step 3.5 Flash (StepFun) | Solid | Solid | Solid | Text primary | Chinese + English |
| ERNIE 5.0 (Baidu) | Solid | Strong | Baidu search-native | Full omnimodal | Chinese + limited English |
Three patterns stand out. First, the coding leaders are MiMo-V2-Pro and Qwen 3.6 Plus — combined they capture roughly 49% of all coding tokens on OpenRouter. Second, Qwen 3.5-Omni and ERNIE 5.0 are the most genuinely omnimodal flagships, with MiMo-V2-Omni close behind. Third, Chinese-language strength is universal but English-language tone quality varies more than headline benchmarks suggest — an important consideration for consumer-facing deployments targeting US audiences.
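One way to operationalize a qualitative matrix like this is a weighted scorecard. The 1-5 scores below are a subjective reading of the table above, and the weights encode one hypothetical team's priorities; both inputs would change per buyer, which is the point of making them explicit:

```python
# Illustrative weighted scorecard. Scores (1-5) are a subjective reading of
# the capability matrix above; weights encode one hypothetical team's
# coding-heavy priorities. Neither is an official benchmark.
WEIGHTS = {"coding": 0.4, "reasoning": 0.2, "tool_use": 0.2,
           "multimodal": 0.1, "multilingual": 0.1}

SCORES = {
    "mimo-v2-pro":   {"coding": 5, "reasoning": 3, "tool_use": 5,
                      "multimodal": 1, "multilingual": 3},
    "qwen-3.6-plus": {"coding": 5, "reasoning": 4, "tool_use": 4,
                      "multimodal": 2, "multilingual": 5},
    "glm-5":         {"coding": 4, "reasoning": 4, "tool_use": 3,
                      "multimodal": 3, "multilingual": 3},
}

def weighted(model: str) -> float:
    """Weighted average score for one model across all dimensions."""
    return sum(WEIGHTS[d] * SCORES[model][d] for d in WEIGHTS)

ranked = sorted(SCORES, key=weighted, reverse=True)
```

The value of the exercise is less the ranking than the argument it forces: a team that cares about English-language consumer output would weight `multilingual` and re-rank immediately.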
Enterprise Readiness and Export Controls
Pricing and capability tell only part of the procurement story. The enterprise-readiness matrix below covers compliance posture, US availability, primary hosting geography, and hardware stack. These are the dimensions that typically decide whether a model passes internal review at a US or EU-based buyer.
| Provider | SOC 2 | GDPR posture | US API access | Primary geography | Hardware |
|---|---|---|---|---|---|
| Xiaomi | Not published | Limited | Via OpenRouter | China | Mixed |
| Alibaba | Alibaba Cloud certs | EU regions available | Direct + OpenRouter | China + international | NVIDIA + mixed |
| Z.ai (Zhipu) | Enterprise program | Self-host path | Via OpenRouter + self-host | China | Huawei Ascend |
| DeepSeek | Not published | Self-host path | Via OpenRouter + self-host | China | NVIDIA today, Ascend (V4) |
| Moonshot | Not published | Self-host path | Via OpenRouter + self-host | China | Mixed |
| MiniMax | Not published | Limited | Via OpenRouter | China | Mixed |
| StepFun | Not published | Limited | Via OpenRouter | China | NVIDIA Hopper |
| ByteDance | Volcano Engine certs | EU tenants available | Direct + OpenRouter | China + international | Mixed |
| Baidu | Domestic certs | Limited | Baidu AI Cloud | China | Baidu Kunlun |
| Tencent | Tencent Cloud certs | EU tenants available | Tencent Cloud | China + international | Mixed |
For regulated industries, the safer path is self-hosting open-weight models (GLM-5, Kimi K2.5, Qwen 3.5 small series) inside your own cloud region with documented data processing controls, rather than hitting Chinese-hosted APIs directly.
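On the self-hosting path, most open-weight serving stacks expose an OpenAI-compatible chat endpoint, so a thin client can stay provider-agnostic across GLM-5, Kimi, or Qwen deployments. The URL, port, and model identifier below are placeholders for illustration, not documented defaults of any specific server:

```python
import json

def chat_payload(model: str, prompt: str,
                 base_url: str = "http://localhost:8000/v1") -> dict:
    """Build a request for an OpenAI-compatible chat endpoint.

    Assumption: the self-hosted server speaks the de facto OpenAI chat
    schema. base_url and model name are placeholders, not real defaults.
    """
    return {
        "url": f"{base_url}/chat/completions",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({
            "model": model,  # e.g. a locally served GLM-5 checkpoint
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

# Example request against a hypothetical in-region GLM-5 deployment:
req = chat_payload("glm-5", "Summarize this diff.")
```

Keeping the client this thin means swapping the hosted API for an in-region deployment is a one-line `base_url` change, which is most of the compliance argument for open weights.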
Huawei Ascend-trained models (GLM-5, and DeepSeek V4 when released) are the procurement story for Chinese state-owned buyers operating under US export controls. For Western buyers, Ascend lineage is neutral to slightly negative, given the smaller tooling ecosystem around non-NVIDIA serving stacks.
Running a model evaluation? Production selection benefits from a structured scoring process across cost, capability, compliance, and workload fit. Our Analytics Insights and CRM Automation practices translate rankings like these into measurable production wins.
Conclusion: The Q2 2026 Playbook
Chinese AI crossed 45% of OpenRouter traffic because ten providers converged on a consistent playbook: ship flagship models with large context windows, price aggressively below US frontier, offer a free preview that AI IDE platforms adopt as a default backend, and keep a clear open-weight path for enterprise self-hosting. Xiaomi's 21.1% share is the most dramatic data point, but the pattern runs across Alibaba, Zhipu, MiniMax, DeepSeek, and StepFun.
The actionable takeaway for Western buyers is narrower than the headlines suggest. Chinese flagship models are genuinely cheaper and competitive on coding and long-context workloads. They lag on compliance posture, English-language tone, and enterprise tooling integrations. The sensible strategy in most production stacks is a multi-model architecture — US frontier for customer-facing English output and compliance-sensitive workflows, Chinese models for internal coding, batch processing, and cost-sensitive agent workloads, all routed through a cost-aware orchestration layer.
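That multi-model strategy reduces to a routing policy. The sketch below is a minimal illustration; the model names are examples drawn from this report and the rules encode one possible policy, not a recommendation:

```python
from dataclasses import dataclass

@dataclass
class Job:
    workload: str                 # "coding" | "batch" | anything else
    customer_facing: bool
    compliance_sensitive: bool

def route(job: Job) -> str:
    """Cost-aware routing policy sketched from the strategy above.

    Illustrative only: model names come from this report, and the
    rules are one possible policy, not a recommendation.
    """
    if job.customer_facing or job.compliance_sensitive:
        return "us-frontier"      # e.g. Claude Opus 4.6 / GPT-5.4
    if job.workload == "coding":
        return "mimo-v2-pro"      # cheap, strong coding tier
    if job.workload == "batch":
        return "qwen-3.5-flash"   # ultra-cheap 1M-context batch
    return "minimax-m2.7"         # default agentic workhorse
```

In production this lives in an orchestration layer with fallbacks and per-route cost tracking, but the decision logic rarely gets more complicated than a handful of guards like these.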
Translate the 2026 AI landscape into production wins
Picking the right mix of frontier and Chinese models across cost, compliance, and capability is where strategy meets engineering. We help teams route the right workload to the right provider.