AI Development · Market share report · Q2 2026

Chinese AI Models Q2 2026: 10-Provider Landscape Report

Q2 2026 market share report on Chinese AI providers — Qwen, GLM, DeepSeek, Kimi, MiniMax, Baichuan, and Yi. Usage data, licensing, and enterprise adoption.

Digital Applied Team
April 12, 2026
12 min read
  • 10 providers covered
  • 45%+ Chinese share of OpenRouter
  • 3x Xiaomi-to-OpenAI usage ratio
  • 21.1% Xiaomi share of tokens

Key Takeaways

Chinese share crossed 45% of OpenRouter traffic: One year ago, Chinese AI providers accounted for less than 2% of OpenRouter tokens. In April 2026, the combined share of Xiaomi, Alibaba, MiniMax, Zhipu, DeepSeek, and StepFun exceeds 45% of total weekly volume.
Xiaomi alone has 3x OpenAI's share: Xiaomi processes 4.21T weekly tokens on OpenRouter for a 21.1% market share, compared to OpenAI's 7.5%. MiMo-V2-Pro is the single most-used model on the platform by a wide margin.
Usage and intelligence are decoupling: The highest-volume Chinese models rank outside the intelligence top 10. Developers are optimizing for blended cost per token and specific capabilities like coding and long context, not raw benchmark leadership.
The field has consolidated to ten serious providers: Xiaomi, Alibaba, Zhipu, DeepSeek, Moonshot, MiniMax, StepFun, ByteDance, Baidu, and Tencent cover the meaningful share of Chinese AI output. Smaller labs from 2024 either merged, pivoted, or fell out of volume rankings.
Coding is the wedge that opened the market: MiMo-V2-Pro and Qwen 3.6 Plus together account for roughly 49% of all coding tokens on OpenRouter. Free-preview access, strong SWE-Bench numbers, and 1M-context windows made Chinese models the default for AI IDE tooling.
Huawei Ascend is the enterprise procurement story: GLM-5 and the upcoming DeepSeek V4 are trained and served on Chinese domestic silicon. For Chinese enterprises and state buyers, that hardware independence is worth more than a benchmark point.
Licensing favors open weights outside the flagship tier: GLM-5 ships under MIT, Qwen 3.5 flagship under an Apache-style commercial license, and Kimi K2.5 as open weights. Xiaomi and MiniMax keep flagship weights closed, matching the pattern established by OpenAI and Anthropic.

Chinese AI providers now serve over 45% of all OpenRouter traffic, up from less than 2% a year ago. Xiaomi alone has 3x OpenAI's share. This is the Q2 2026 landscape report: who the ten meaningful providers are, what they ship, how they price it, and how to evaluate them for production workloads.

The shift is not a benchmark story. Chinese models do not yet lead the Artificial Analysis Intelligence Index — MiMo-V2-Pro ranks #10 despite being the #1 model by usage. The shift is a cost, availability, and developer-choice story. Free-preview access, 1M token context windows, and per-token prices three to ten times below US frontier models have moved the default backend for AI coding IDEs, agent platforms, and cost-sensitive production workloads east.

Q2 2026 Landscape at a Glance

Ten providers cover essentially all meaningful Chinese AI output in Q2 2026. The group has consolidated sharply since 2024, when dozens of labs published competing checkpoints. Today the volume flows through Xiaomi, Alibaba, Z.ai (Zhipu), DeepSeek, Moonshot AI, MiniMax, StepFun, ByteDance, Baidu, and Tencent, with Baichuan, Yi, Xunfei, and KwaiKAT operating as second-tier niche players.

| Rank | Provider | Weekly tokens | Share | Flagship model |
|---|---|---|---|---|
| 1 | Xiaomi | 4.21T | 21.1% | MiMo-V2-Pro |
| 2 | Alibaba (Qwen) | 2.77T | 13.9% | Qwen 3.6 Plus |
| 3 | MiniMax | 1.62T | 8.1% | MiniMax M2.7 |
| 4 | Z.ai (Zhipu) | 1.12T | 5.6% | GLM-5 / GLM-5 Turbo |
| 5 | DeepSeek | 1.11T | 5.6% | DeepSeek V3.2 |
| 6 | StepFun | 1.07T | 5.3% | Step 3.5 Flash |
| 7 | Moonshot AI | Sub-rank | Tracked | Kimi K2.5 |
| 8 | ByteDance | Sub-rank | Tracked | Seed 2.0 (Doubao) |
| 9 | Baidu | Domestic | Domestic | ERNIE 5.0 |
| 10 | Tencent | Domestic | Domestic | Hunyuan (internal) |

Share figures reflect OpenRouter weekly token volume at the provider level. Baidu, Tencent, and Moonshot concentrate usage on domestic Chinese surfaces and partner ecosystems rather than OpenRouter, so their global ranking understates their home-market presence. For Western buyers evaluating these providers, the OpenRouter data is still the most defensible apples-to-apples benchmark.

Xiaomi: The Phone Company Dominating AI Volume

Xiaomi is the story of this report. A consumer electronics company best known for smartphones and smart-home hardware holds 21.1% of OpenRouter weekly tokens, three times OpenAI's 7.5%. Xiaomi's AI lab shipped three frontier checkpoints between December 2025 and March 2026 under the MiMo brand, and each variant carved out a clear slot in the pricing curve.

MiMo V2 Pro
March 18, 2026 · Flagship

1.04M context, $1 input / $3 output per million tokens. The #1 model on OpenRouter by weekly tokens and 25.5% of all coding traffic.

MiMo V2 Omni
March 18, 2026 · Omnimodal

262K context, $0.40 input / $2 output per million tokens. Unified image, video, and audio architecture in a single checkpoint.

MiMo V2 Flash
December 2025 · Ultra-cheap

262K context, $0.09 input / $0.29 output per million tokens. Top open-source claim on general reasoning at this price point.

For deeper technical coverage, see our MiMo-V2-Pro trillion-parameter release guide and the MiMo-V2-Omni omnimodal release guide.

Why Xiaomi won volume

Three decisions drove the rankings. First, MiMo-V2-Pro shipped on OpenRouter with a free preview tier that AI coding IDEs adopted as a default backend within weeks. Second, the 1.04M context window matched Qwen 3.6 Plus and exceeded most US frontier context ceilings, making MiMo the obvious choice for whole-repo refactors. Third, the pricing gap versus Claude Opus 4.6 is roughly 5x at input and 8x at output, which matters when agent frameworks expand token consumption by an order of magnitude.
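The pricing gap compounds once an agent framework multiplies token consumption. A back-of-envelope sketch using the list prices quoted in this report; the 10x agent multiplier and the 50K/5K baseline task size are illustrative assumptions, not measured figures:

```python
def run_cost(input_tokens, output_tokens, in_price, out_price):
    """Cost in USD for a workload, given $/1M-token list prices."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Hypothetical single-shot task: 50K input tokens, 5K output tokens.
base_in, base_out = 50_000, 5_000

# Agent frameworks that replay context across steps can expand
# token consumption by roughly an order of magnitude.
agent_in, agent_out = base_in * 10, base_out * 10

mimo = run_cost(agent_in, agent_out, in_price=1.00, out_price=3.00)   # MiMo-V2-Pro
opus = run_cost(agent_in, agent_out, in_price=5.00, out_price=25.00)  # Claude Opus 4.6

print(f"MiMo-V2-Pro: ${mimo:.2f}, Opus 4.6: ${opus:.2f}, ratio: {opus / mimo:.1f}x")
# → MiMo-V2-Pro: $0.65, Opus 4.6: $3.75, ratio: 5.8x
```

At single-task scale the absolute difference is pennies; at agent-fleet scale the blended ratio is what moves the backend decision.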

Alibaba Qwen: The Scale Leader

Alibaba's Qwen family is the second-ranked provider at 13.9% share and 2.77T weekly tokens, but it is the broadest product line in Chinese AI. Between January 23 and April 2, 2026, Alibaba shipped six named Qwen releases covering coding, reasoning, ultra-cheap inference, omnimodal, and flagship tiers.

  • Qwen 3.6 Plus (Apr 2, 2026) — Flagship. 1M context, 65K output, always-on chain-of-thought, native function calling. Free during preview. Currently #2 on OpenRouter with 1.64T weekly tokens.
  • Qwen 3.5-Omni (Mar 30, 2026) — Native omnimodal. 256K context, 113 languages for speech recognition, Thinker-Talker architecture. Mostly closed-source.
  • Qwen 3.5 Flash (Feb 24, 2026) — Ultra-cheap high context. 1M tokens, $0.065 input / $0.26 output per million. Top pick for cost-sensitive batch workloads.
  • Qwen 3 Coder Next (Feb 3, 2026) — Coding-specific. 256K context, $0.12 input / $0.75 output per million. Purpose-built for IDE integrations.
  • Qwen 3 Max Thinking (Jan 23, 2026) — Reasoning variant. 262K context, $0.78 input / $3.90 output per million. Positioned against GPT-5.4 Pro and Opus 4.6 on hard problems.
  • Qwen 3.5 small series (Mar 2-3, 2026) — On-device. 0.8B to 9B parameters. The 9B variant beats several closed US models on GPQA Diamond.

For current flagship detail, see our Qwen 3.6 Plus 1M-context release guide.

Alibaba is also the provider most actively pushing Chinese AI into cross-border commerce and enterprise workflows through Alibaba Cloud International. For Western teams evaluating Chinese models under compliance constraints, Qwen's licensing terms and cloud footprint are typically the least painful starting point.

Zhipu GLM: Enterprise Enablement

Z.ai — the international brand for Zhipu AI — holds 5.6% OpenRouter share but commands an outsized share of Chinese domestic enterprise procurement. GLM-5 launched February 11, 2026 with 744B total parameters, 44B active per token, a Mixture-of-Experts architecture with 256 experts, and an MIT license. The model trains and serves end-to-end on Huawei Ascend silicon, making it the cleanest answer for Chinese state-owned and export-control-sensitive buyers.

GLM-5 at a glance
  • 744B total parameters, 44B active, 200K context window.
  • 77.8% SWE-bench Verified, competitive with Claude Sonnet 4.6.
  • $0.80 input / $2.56 output per million tokens on direct API, $1.20 / $4 on the GLM-5 Turbo faster variant.
  • MIT licensed weights — among the most permissive in the Chinese frontier tier.
  • GLM-5V-Turbo (April 1, 2026) extends the 744B base with multimodal vision and competitive agentic-browsing benchmark results.

For the full architectural and benchmark breakdown, see our Zhipu GLM-5 744B MoE release analysis.

Zhipu's positioning is the clearest example of the bifurcation in Chinese AI. While Xiaomi and Alibaba optimize for global developer volume, Zhipu optimizes for Chinese enterprise deployment, domestic-hardware independence, and permissive licensing. For Western self-hosters and researchers, GLM-5's MIT license is often the deciding factor over the Qwen and Kimi licensing frameworks.

DeepSeek: Open-Weight Economics

DeepSeek holds 5.6% OpenRouter share through DeepSeek V3.2, a 685B parameter MoE model released December 2025 with DeepSeek Sparse Attention, gold-medal performance on the 2025 IMO and IOI, and Thinking-in-Tool-Use behavior. The Speciale variant reportedly surpasses GPT-5 on specific reasoning benchmarks, though direct head-to-head evaluations remain hard to replicate.

For a detailed walkthrough of the V3.2 release, see our DeepSeek V3.2 and Speciale complete guide.

DeepSeek's strategic value is open-weight economics. V3.2 is available for self-hosting, fine-tuning, and quantization with commercial license terms that many Western teams find acceptable. When the production decision comes down to "run the model inside my own cloud region or not," DeepSeek is usually on the shortlist alongside GLM-5 and Kimi K2.5.

Moonshot Kimi: Long-Context Agents

Moonshot AI's Kimi K2.5 launched January 27, 2026 with 1 trillion total parameters, 32B active per request, a 262K context window, and $0.38 input / $1.72 output per million tokens. The standout feature is Agent Swarm technology — the ability to coordinate up to 100 agents simultaneously inside a single inference loop. Moonshot claims K2.5 beats Claude Opus 4.5 on agentic benchmarks, a claim that holds up on several public harnesses.

Kimi K2.5 is also notable as the base model powering Cursor Composer 2, which scored 73.7% on SWE-bench Multilingual at launch. That makes K2.5 one of the few Chinese models actively embedded in a Western developer tool's production stack, rather than served only as a standalone API.

For the full agent swarm architecture breakdown, see our Kimi K2.5 agent swarm open-source guide.

MiniMax: Self-Evolving Agentic Workflows

MiniMax holds 8.1% OpenRouter share and is the provider most associated with agentic self-evolution. Its M-series shipped four numbered releases in under six months: M2, M2.1, M2.5 (February 12, 2026, 80.2% SWE-Bench Verified), and M2.7 (March 18, 2026, 56.22% SWE-Pro, 10B active parameters, roughly 50x cheaper than Claude Opus per comparable workload). MiniMax M2.7 is currently #4 on OpenRouter at 1.34T weekly tokens with +24% week-over-week growth.

For deeper coverage on the M2.7 release, see our MiniMax M2.7 agentic coding release guide.

MiniMax's positioning is the inverse of Xiaomi's. Rather than competing on raw context or pricing, MiniMax leans into self-evolving architectures — models that update internal representations across multi-step runs, enabling agents to refine strategies within a single task rather than requiring human feedback cycles. For agentic workloads with well-defined reward signals, M2.7 is often the best price-to-capability option on the OpenRouter top ten.

StepFun and Second-Tier Providers

Step 3.5 Flash from StepFun (February 2, 2026) is the surprise performer outside the top four: a 196B MoE with 11B active per token and a 262K context window, priced at $0.10 input / $0.30 output per million tokens in the paid tier and free on OpenRouter preview. It sits at #3 on the OpenRouter free-model leaderboard at 1.38T weekly tokens. StepFun trained the model on NVIDIA Hopper rather than Ascend, and serves it at up to 350 tokens per second.

Below StepFun, a handful of second-tier providers serve specific niches:

  • ByteDance Seed 2.0 (Doubao) — China's most-used consumer AI app with 155M weekly active users. Pro variant matches GPT-5.2 at ~10x lower cost. Seed 2.0 Lite and Mini extend the family into ultra-cheap tiers ($0.10-$0.25 per million input).
  • Baidu ERNIE 5.0 — 2.4T-parameter omnimodal flagship, trained on Baidu Kunlun silicon. Integrated with Baidu's search engine for the dominant Chinese discovery surface.
  • KwaiKAT KAT-Coder-Pro V2 — Coding-specific Kuaishou model released March 27, 2026. 256K context, $0.30 / $1.20 per million, competitive on Chinese-language coding.
  • Baichuan and Yi — Active open-source and enterprise tiers in China, limited OpenRouter presence, positioned against Qwen and GLM for domestic buyers.
  • Xunfei (iFlytek) — Spark series, strong on speech, education, and public-sector workflows. Limited global-market ranking.
  • Tencent Hunyuan — Internal enterprise flagship, integrated across WeChat and Tencent Cloud. Minimal OpenRouter footprint by design.

Note that NVIDIA's Nemotron 3 Super 120B (released March 10-11, 2026, 60.47% SWE-Bench Verified, open source, 262K context) often appears in "Chinese AI" conversations because of its pricing profile, but it is an NVIDIA model trained and served outside China. Do not count it toward Chinese market share.

Pricing Comparison Matrix

The table below covers thirteen current flagship and volume models from eight of the ten providers, sorted by provider. All figures are OpenRouter list prices as of April 2026 and cover input, output, and context for the default API variant.

| Provider | Model | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| Xiaomi | MiMo V2 Pro | $1.00 | $3.00 | 1.04M |
| Xiaomi | MiMo V2 Flash | $0.09 | $0.29 | 262K |
| Alibaba | Qwen 3.6 Plus | Free (preview) | Free (preview) | 1M |
| Alibaba | Qwen 3.5 Flash | $0.065 | $0.26 | 1M |
| Alibaba | Qwen 3 Max Thinking | $0.78 | $3.90 | 262K |
| Z.ai (Zhipu) | GLM-5 | $0.80 | $2.56 | 200K |
| Z.ai (Zhipu) | GLM-5 Turbo | $1.20 | $4.00 | 203K |
| DeepSeek | DeepSeek V3.2 | Low | Low | Long |
| Moonshot | Kimi K2.5 | $0.38 | $1.72 | 262K |
| MiniMax | MiniMax M2.7 | $0.30 | $1.20 | 205K |
| MiniMax | MiniMax M2.5 | $0.12 | $0.99 | 197K |
| StepFun | Step 3.5 Flash | $0.10 | $0.30 | 262K |
| ByteDance | Seed 2.0 Lite | $0.25 | $2.00 | 262K |

For reference, OpenAI GPT-5.4 sits at $2.50 / $15.00 with 1.05M context, Claude Sonnet 4.6 at $3.00 / $15.00 with 1M, and Claude Opus 4.6 at $5.00 / $25.00 with 1M. Against those anchors, the Chinese flagship tier undercuts US pricing by roughly 2.5-5x at input and 4-8x at output on comparable context lengths.
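To compare across tiers on a single axis, a blended per-million-token price is more useful than raw input/output pairs. A minimal sketch using the list prices from the table above; the 3:1 input-to-output ratio is an illustrative assumption about workload shape, and the model subset is abbreviated:

```python
# Model: (input $/1M, output $/1M), April 2026 list prices from the table above.
PRICES = {
    "MiMo V2 Pro":     (1.00, 3.00),
    "MiMo V2 Flash":   (0.09, 0.29),
    "Qwen 3.5 Flash":  (0.065, 0.26),
    "GLM-5":           (0.80, 2.56),
    "Kimi K2.5":       (0.38, 1.72),
    "MiniMax M2.7":    (0.30, 1.20),
    "Step 3.5 Flash":  (0.10, 0.30),
    "Claude Opus 4.6": (5.00, 25.00),  # US reference anchor
}

def blended(in_price, out_price, in_ratio=0.75):
    """Weighted $/1M price, assuming input tokens are in_ratio of traffic."""
    return in_price * in_ratio + out_price * (1 - in_ratio)

# Rank models cheapest-first at the assumed 3:1 input:output mix.
for name, (p_in, p_out) in sorted(PRICES.items(), key=lambda kv: blended(*kv[1])):
    print(f"{name:16s} ${blended(p_in, p_out):6.3f} /1M blended")
```

Under this mix, MiMo-V2-Pro blends to $1.50/1M against Opus 4.6's $10.00/1M, a 6.7x gap that sits inside the 2.5-8x range quoted above; shifting the ratio toward output-heavy agent traffic widens it.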

Capability Comparison Matrix

The capability matrix below scores flagship models across five workloads: coding, reasoning, tool use, multimodal inputs, and multilingual handling. Scores reflect the reference benchmarks cited in our provider breakdowns earlier in this report.

| Model | Coding | Reasoning | Tool use | Multimodal | Multilingual |
|---|---|---|---|---|---|
| MiMo V2 Pro (Xiaomi) | Strong (25.5% coding share) | Solid | #1 OpenRouter tool calls | Text only | Chinese + English |
| MiMo V2 Omni (Xiaomi) | Solid | Solid | Strong | Image + video + audio | Chinese + English |
| Qwen 3.6 Plus (Alibaba) | Strong (23.5% coding share) | Always-on CoT | Native function calling | Text + limited image | Broad |
| Qwen 3.5-Omni (Alibaba) | Solid | Solid | Strong | Full omnimodal | 113 languages speech |
| GLM-5 (Zhipu) | 77.8% SWE-Verified | Strong | Solid | GLM-5V-Turbo variant | Chinese + English |
| DeepSeek V3.2 | Strong | IMO/IOI gold | Thinking-in-Tool-Use | Text only | Chinese + English |
| Kimi K2.5 (Moonshot) | Cursor Composer 2 base | Strong | 100-agent swarm | Multimodal MoE | Chinese + English |
| MiniMax M2.7 | 56.22% SWE-Pro | Self-evolving | Strong | Text primary | Chinese + English |
| Step 3.5 Flash (StepFun) | Solid | Solid | Solid | Text primary | Chinese + English |
| ERNIE 5.0 (Baidu) | Solid | Strong | Baidu search-native | Full omnimodal | Chinese + limited English |

Three patterns stand out. First, the coding leaders are MiMo-V2-Pro and Qwen 3.6 Plus — combined they capture roughly 49% of all coding tokens on OpenRouter. Second, Qwen 3.5-Omni and ERNIE 5.0 are the most genuinely omnimodal flagships, with MiMo-V2-Omni close behind. Third, Chinese-language strength is universal but English-language tone quality varies more than headline benchmarks suggest — an important consideration for consumer-facing deployments targeting US audiences.

Enterprise Readiness and Export Controls

Pricing and capability tell only part of the procurement story. The enterprise-readiness matrix below covers compliance posture, US availability, primary hosting geography, and hardware stack. These are the dimensions that typically decide whether a model passes internal review at a US or EU-based buyer.

| Provider | SOC 2 | GDPR posture | US API access | Primary geography | Hardware |
|---|---|---|---|---|---|
| Xiaomi | Not published | Limited | Via OpenRouter | China | Mixed |
| Alibaba | Alibaba Cloud certs | EU regions available | Direct + OpenRouter | China + international | NVIDIA + mixed |
| Z.ai (Zhipu) | Enterprise program | Self-host path | Via OpenRouter + self-host | China | Huawei Ascend |
| DeepSeek | Not published | Self-host path | Via OpenRouter + self-host | China | NVIDIA today, Ascend (V4) |
| Moonshot | Not published | Self-host path | Via OpenRouter + self-host | China | Mixed |
| MiniMax | Not published | Limited | Via OpenRouter | China | Mixed |
| StepFun | Not published | Limited | Via OpenRouter | China | NVIDIA Hopper |
| ByteDance | Volcano Engine certs | EU tenants available | Direct + OpenRouter | China + international | Mixed |
| Baidu | Domestic certs | Limited | Baidu AI Cloud | China | Baidu Kunlun |
| Tencent | Tencent Cloud certs | EU tenants available | Tencent Cloud | China + international | Mixed |
Compliance-first buyers

For regulated industries, the safer path is self-hosting open-weight models (GLM-5, Kimi K2.5, Qwen 3.5 small series) inside your own cloud region with documented data processing controls, rather than hitting Chinese-hosted APIs directly.

Hardware independence

Huawei Ascend-trained models (GLM-5, DeepSeek V4 when released) are the procurement story for Chinese state-owned buyers under US export controls. For Western buyers, the choice is neutral to slightly negative given the smaller tooling ecosystem.

Conclusion: The Q2 2026 Playbook

Chinese AI crossed 45% of OpenRouter traffic because ten providers converged on a consistent playbook: ship flagship models with large context windows, price aggressively below US frontier, offer a free preview that AI IDE platforms adopt as a default backend, and keep a clear open-weight path for enterprise self-hosting. Xiaomi's 21.1% share is the most dramatic data point, but the pattern runs across Alibaba, Zhipu, MiniMax, DeepSeek, and StepFun.

The actionable takeaway for Western buyers is narrower than the headlines suggest. Chinese flagship models are genuinely cheaper and competitive on coding and long-context workloads. They lag on compliance posture, English-language tone, and enterprise tooling integrations. The sensible strategy in most production stacks is a multi-model architecture — US frontier for customer-facing English output and compliance-sensitive workflows, Chinese models for internal coding, batch processing, and cost-sensitive agent workloads, all routed through a cost-aware orchestration layer.
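A cost-aware orchestration layer of this kind reduces, at its core, to a routing table plus a compliance override. A minimal sketch of that shape; the workload categories, routing choices, and model identifiers are illustrative assumptions, not actual OpenRouter model IDs:

```python
from dataclasses import dataclass

@dataclass
class Task:
    workload: str                        # e.g. "coding", "batch", "agent", "customer_facing"
    compliance_sensitive: bool = False   # regulated data or audit-scoped workflow

# Hypothetical routing table following the strategy above:
# US frontier for customer-facing and compliance work, Chinese
# models for internal coding, batch, and agent workloads.
ROUTES = {
    "customer_facing": "us-frontier/flagship",
    "coding":          "xiaomi/mimo-v2-pro",
    "batch":           "alibaba/qwen-3.5-flash",
    "agent":           "minimax/m2.7",
}
DEFAULT = ROUTES["customer_facing"]

def route(task: Task) -> str:
    """Pick a backend; compliance-sensitive work always stays on the US frontier tier."""
    if task.compliance_sensitive:
        return DEFAULT
    return ROUTES.get(task.workload, DEFAULT)

print(route(Task("coding")))                             # cheap Chinese coding backend
print(route(Task("coding", compliance_sensitive=True)))  # forced onto US frontier
```

Real deployments layer budget tracking, fallback chains, and latency SLOs on top, but the compliance override belongs at this level: it should be impossible for a cost rule to outvote it.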

Translate the 2026 AI landscape into production wins

Picking the right mix of frontier and Chinese models across cost, compliance, and capability is where strategy meets engineering. We help teams route the right workload to the right provider.
