
GLM-5 Released: 744B MoE Model vs GPT-5.2 & Claude Opus 4.5

Zhipu AI launches GLM-5 with 744B parameters, 200K context, and agentic intelligence — trained entirely on Huawei Ascend chips. Full technical analysis.

Digital Applied Team
February 11, 2026
10 min read
  • 744B total parameters
  • 40B active parameters
  • 28.5T pre-training tokens
  • 200K context window

Key Takeaways

  • 744 billion parameters with only 40B active per inference: GLM-5 uses a Mixture of Experts architecture scaling from GLM-4.5's 355B/32B active to 744B/40B active, trained on 28.5T tokens — delivering frontier performance at efficient compute cost.
  • Top-ranked open-source model globally: GLM-5 leads open-source models on Vending Bench 2, BrowseComp, and MCP-Atlas. Scores 77.8% on SWE-bench Verified and approaches Claude Opus 4.5 across coding and agentic benchmarks.
  • Trained entirely on Huawei Ascend chips: Zero dependency on US-manufactured hardware. GLM-5 uses the MindSpore framework on Ascend chips — a milestone for China's self-reliant AI infrastructure.
  • Open-weight release now available under MIT license: GLM-5 weights are live on HuggingFace (zai-org/GLM-5), with API access via chat.z.ai and OpenRouter. Also available for Claude Code users via GLM Coding Plan.

Zhipu AI — the Tsinghua University spinoff that rebranded to Z.AI in 2025 and completed a landmark Hong Kong IPO in January 2026 — has officially released GLM-5, its fifth-generation large language model. With 744 billion total parameters (40B active), a 200K-token context window, and built-in agentic intelligence, GLM-5 is positioned as a direct challenger to OpenAI's GPT-5.2 and Anthropic's Claude Opus 4.5.

What makes this release strategically significant goes beyond raw capability numbers: GLM-5 was trained entirely on Huawei Ascend chips using the MindSpore framework, achieving full independence from US-manufactured semiconductor hardware. This is both a technical milestone and a geopolitical statement about the viability of China's domestic AI compute stack at frontier scale.

What Is GLM-5?

GLM-5 is the fifth generation of Zhipu AI's General Language Model series, representing a generational leap from the previous GLM-4.7 (released December 2025). The model is engineered for five core domains: creative writing, coding, advanced reasoning, agentic intelligence, and long-context processing.

Zhipu AI, founded in 2019, has rapidly established itself as a leader in open-source AI. The company's Hong Kong IPO on January 8, 2026, raised approximately HKD 4.35 billion (USD $558 million) — making it the first publicly traded foundation model company globally. That capital has directly accelerated GLM-5's development.

Technical Architecture

GLM-5 employs a Mixture of Experts (MoE) architecture, scaling from GLM-4.5's 355B params (32B active) to 744B (40B active), with pre-training data growing from 23T to 28.5T tokens:

| Specification | GLM-5 | Notes |
|---|---|---|
| Total parameters | 744 billion | Up from GLM-4.5's 355B (2.1× scale) |
| Active parameters | 40 billion | Per inference, up from 32B in GLM-4.5 |
| Expert architecture | 256 experts | 8 activated per token |
| Attention mechanism | DSA (sparse) | DeepSeek's sparse attention for long context |
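
To make the expert-routing idea concrete, here is a minimal PyTorch sketch of a top-k mixture-of-experts feed-forward layer. It is a generic illustration, not Zhipu AI's implementation: GLM-5 reportedly routes each token to 8 of 256 experts, while the tiny dimensions below are chosen only so the snippet runs quickly.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Toy mixture-of-experts feed-forward layer with top-k token routing."""

    def __init__(self, d_model=256, d_ff=512, n_experts=16, top_k=2):
        # GLM-5 reportedly uses 256 experts with 8 active per token;
        # the small sizes here just keep the toy fast.
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                                       # x: (num_tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)                # routing probabilities
        weights, expert_idx = gate.topk(self.top_k, dim=-1)     # (num_tokens, top_k)
        weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize kept weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in expert_idx[:, slot].unique():
                sel = expert_idx[:, slot] == e                  # tokens routed to expert e
                out[sel] += weights[sel, slot].unsqueeze(-1) * self.experts[int(e)](x[sel])
        return out

tokens = torch.randn(10, 256)
print(TopKMoELayer()(tokens).shape)  # torch.Size([10, 256])
```

Only the selected experts participate in each token's forward pass, which is how a 744B-parameter model keeps roughly 40B parameters active per token.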

The model incorporates DeepSeek's Dynamically Sparse Attention (DSA) mechanism for efficient long-context handling, enabling GLM-5 to process sequences up to 200,000 tokens without the computational overhead of traditional dense attention. Maximum output reaches 131,000 tokens — among the highest in the industry.
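
The article does not spell out DSA's internals, so the sketch below shows only the general idea behind sparse attention: each query keeps a small top-k subset of key scores rather than attending densely to every position. This is a toy PyTorch illustration, not DeepSeek's or Zhipu AI's kernel.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, keep=64):
    """Each query attends only to its `keep` highest-scoring keys."""
    # q, k, v: (batch, heads, seq_len, head_dim)
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)   # (b, h, s_q, s_k)
    keep = min(keep, scores.size(-1))
    top = scores.topk(keep, dim=-1)
    sparse = torch.full_like(scores, float("-inf"))
    sparse.scatter_(-1, top.indices, top.values)             # keep only top-k scores
    return F.softmax(sparse, dim=-1) @ v

q = k = v = torch.randn(1, 2, 128, 32)
print(topk_sparse_attention(q, k, v).shape)  # torch.Size([1, 2, 128, 32])
```

Note that this naive version still materializes the full score matrix before masking, so it illustrates the selection logic rather than the compute savings a production sparse-attention kernel achieves.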

The GLM-5 family also includes specialized variants: GLM-Image for high-fidelity image generation using a hybrid auto-regressive and diffusion approach, and GLM-4.6V/4.5V for advanced multimodal reasoning that combines vision and language understanding.

Core Capabilities

Creative Writing

GLM-5 generates high-quality, nuanced creative content with stylistic versatility — from long-form narrative and technical documentation to marketing copy and academic prose. This is a noted improvement area over GLM-4.7.

Coding

A leap from vibe coding to agentic engineering. GLM-5 excels at systems engineering and full-stack development, scoring 77.8% on SWE-bench Verified (approaching Claude Opus 4.5's 80.9%). On CC-Bench-V2, GLM-5 hits 98% frontend build success rate and 74.8% end-to-end correctness.

Advanced Reasoning

Frontier-level multi-step logical reasoning with significantly reduced hallucinations. GLM-5 scores 50.4 on Humanity's Last Exam (with tools), 89.7 on τ²-Bench, and 75.9 on BrowseComp — #1 among all models tested on the latter.

Agentic Intelligence

A deep evolution from reasoning to delivery. GLM-5's Agent Mode (Beta) moves beyond conversation to a delivery-first paradigm — automatically decomposing tasks, orchestrating tools, and executing workflows to produce ready-to-use results.

  • Data Insights: Upload data, get instant charts (bar, line, pie) and analysis. Export as xlsx/csv/png.
  • Smart Writing: From outline to final draft with step-by-step control. Direct PDF/Word export.
  • Full-Stack Development: Enhanced instruction understanding and multi-step task execution for complex engineering.

Long-Context Processing

200K-token context window handles massive documents, entire codebases, research paper collections, and video transcripts in a single session. The 131K maximum output is among the industry's highest.
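
As a rough capacity-planning aid, the sketch below estimates whether a document fits in a 200K-token window using the common "about 4 characters per token" heuristic for English text. Both the heuristic and the file name are illustrative assumptions; for real planning, use the official GLM-5 tokenizer.

```python
CONTEXT_WINDOW = 200_000  # GLM-5's stated input window, in tokens

def rough_token_count(text: str) -> int:
    # ~4 characters per token is a rough English-text heuristic,
    # not GLM-5's actual tokenizer.
    return max(1, len(text) // 4)

doc = open("repo_dump.txt", encoding="utf-8").read()  # hypothetical document
tokens = rough_token_count(doc)
print(f"~{tokens:,} tokens; fits in one request: {tokens <= CONTEXT_WINDOW}")
```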

Huawei Ascend: Hardware Independence

GLM-5's training on Huawei Ascend chips is arguably as significant as the model itself. In the context of US export controls restricting advanced NVIDIA GPUs to China, Zhipu AI has demonstrated that frontier-scale AI training is achievable on domestic hardware.

Training Infrastructure
GLM-5's domestic training stack
  • Hardware: Huawei Ascend 910 series chips
  • Framework: MindSpore — Huawei's open-source deep learning framework
  • Scale: 744B parameters trained end-to-end without US hardware
  • Significance: First frontier-scale MoE model fully trained on non-NVIDIA hardware
  • Implication: Validates China's AI compute independence strategy

This aligns with China's broader push for semiconductor self-sufficiency, targeting substantial independence in data center chips by 2027. For the global AI industry, it signals that hardware diversity in AI training is not just possible — it's happening at frontier scale.
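
For readers unfamiliar with the stack, the snippet below is a minimal, purely illustrative MindSpore example that targets an Ascend backend. It is not Zhipu AI's training code; it only shows how the framework selects the device target.

```python
# Requires an Ascend device and Huawei's CANN stack; change device_target
# to "CPU" or "GPU" to run elsewhere.
import numpy as np
import mindspore as ms
from mindspore import nn, Tensor

ms.set_context(mode=ms.GRAPH_MODE, device_target="Ascend")

class TinyMLP(nn.Cell):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Dense(16, 32)
        self.act = nn.ReLU()
        self.fc2 = nn.Dense(32, 4)

    def construct(self, x):
        return self.fc2(self.act(self.fc1(x)))

net = TinyMLP()
x = Tensor(np.random.randn(2, 16).astype(np.float32))
print(net(x).shape)  # (2, 4)
```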

Benchmark Performance

GLM-5 has been evaluated across 8 major agentic, reasoning, and coding benchmarks. It ranks as the #1 open-source model globally — and is competitive with closed-source frontier models from OpenAI, Anthropic, and Google.

| Benchmark | GLM-5 | Claude Opus 4.5 | GPT-5.2 | Gemini 3 Pro |
|---|---|---|---|---|
| Humanity's Last Exam | 50.4 (w/ Tools) | 43.4 (w/ Tools) | 45.8 (w/ Tools) | 45.5 (w/ Tools) |
| SWE-bench Verified | 77.8% | 80.9% | 76.2% | 80.0% |
| SWE-bench Multilingual | 73.3% | 77.5% | 65.0% | 72.0% |
| Terminal-Bench 2.0 | 56.2% | 59.3% | 54.2% | 54.0% |
| BrowseComp | 75.9 🥇 | 67.8 | 59.2 | 65.8 |
| MCP-Atlas | 67.8 | 65.2 | 66.6 | 68.0 |
| τ²-Bench | 89.7 | 91.6 | 90.7 | 85.5 |
| Vending Bench 2 | $4,432 🥇 OS | $4,967 | $5,478 | $3,591 |

🥇 = highest score among all models. 🥇 OS = highest among open-source models. Source: Zhipu AI official benchmarks.

CC-Bench-V2: Internal Engineering Evaluation
GLM-5 vs Claude Opus 4.5 on real-world engineering tasks

| Category | Metric | GLM-5 | Claude Opus 4.5 |
|---|---|---|---|
| Frontend | Build success | 98.0% | 93.0% |
| Frontend | E2E correctness | 74.8% | 75.7% |
| Backend | E2E correctness | 25.8% | 26.9% |
| Long-horizon | Large repo | 65.6% | 64.5% |
| Long-horizon | Multi-step | 52.3% | 61.6% |

GLM-5 narrows the gap with Claude Opus 4.5 significantly, especially in frontend work (+26% build success rate improvement over GLM-4.7).

The model previously appeared on OpenRouter as "Pony Alpha" in early February 2026 — a stealth test that the AI community quickly identified through benchmark analysis and GitHub pull requests. Zhipu AI has since confirmed the connection, and GLM-5 is now officially listed on OpenRouter at openrouter.ai/z-ai/glm-5.
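
For quick experimentation, OpenRouter exposes an OpenAI-compatible endpoint, so a call along the lines of the sketch below should work. The model slug "z-ai/glm-5" is inferred from the listing URL above; confirm the exact identifier on OpenRouter before relying on it.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible endpoint
    api_key="YOUR_OPENROUTER_API_KEY",
)

response = client.chat.completions.create(
    model="z-ai/glm-5",  # assumed slug, derived from openrouter.ai/z-ai/glm-5
    messages=[
        {"role": "user", "content": "Summarize the trade-offs of MoE models in three bullets."}
    ],
)
print(response.choices[0].message.content)
```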

GLM-5 vs GPT-5.2 vs Claude Opus 4.5

| Dimension | GLM-5 | GPT-5.2 | Claude Opus 4.5 |
|---|---|---|---|
| Architecture | 744B MoE, 40B active | Undisclosed, dense transformer | Undisclosed, dense transformer |
| Context window | 200K tokens | 400K tokens | 200K (1M beta) |
| Pricing (input) | ~$0.11/M tokens (est.) | $1.75/M tokens | $5.00/M tokens |
| Open source | MIT license (weights on HuggingFace) | Closed source | Closed source |
| Training hardware | Huawei Ascend (fully domestic) | NVIDIA H100/B200 | NVIDIA / Google TPU |

GLM-5's competitive advantages center on pricing and openness: it offers frontier-level capability at a fraction of GPT-5.2's cost, with open weights already available under MIT license. It leads all open-source models on Vending Bench 2 ($4,432) and BrowseComp (75.9), while approaching Claude Opus 4.5 on coding benchmarks like SWE-bench Verified (77.8% vs 80.9%). For a broader look at how Chinese AI labs are competing at the frontier, see our Chinese AI models comparison.

Pricing and Access

Cost efficiency has been a consistent advantage of the GLM series. GLM-5 is available through multiple channels — from direct chat to open weights to a dedicated coding subscription plan.

| Model | Input Price | Output Price | Open Weights |
|---|---|---|---|
| GLM-4.5 (current) | $0.35/M | $1.55/M | Yes (HuggingFace) |
| GLM-5 | ~$0.11/M | TBD | Yes (HuggingFace) |
| GPT-5.2 | $1.75/M | $14.00/M | No |
| Claude Opus 4.5 | $5.00/M | $25.00/M | No |
| Claude Sonnet 4.5 | $3.00/M | $15.00/M | No |

Access Channels

GLM-5 is available through chat.z.ai, the Z.AI API and OpenRouter, open weights on HuggingFace (zai-org/GLM-5), and the GLM Coding Plan subscription detailed below.
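
For self-hosting, the open weights can in principle be loaded with Hugging Face transformers as sketched below. The repo id zai-org/GLM-5 comes from the release notes; the exact loading recipe (remote code, dtype, sharding) depends on the published model card, and a 744B MoE will not fit on a single GPU, so treat this as the shape of the call rather than a deployable configuration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zai-org/GLM-5"  # repo id from the official release
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",    # shards across whatever accelerators are available
    torch_dtype="auto",
)

inputs = tokenizer("Explain mixture-of-experts routing in one paragraph.", return_tensors="pt")
outputs = model.generate(**inputs.to(model.device), max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```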

GLM Coding Plan

Z.AI offers a dedicated GLM Coding Plan — a subscription package designed specifically for AI-powered coding. It provides access to GLM models across mainstream coding tools at a fraction of standard API pricing.

Lite ($10/month): 3× the usage of the Claude Pro plan

  • For lightweight workloads
  • Supports only GLM-4.7 and earlier text models
  • Compatible with 20+ coding tools, including Claude Code, Cursor, Cline, and Kilo Code

Pro ($30/month, most popular): 5× the Lite plan's usage

  • For complex workloads
  • All Lite plan benefits
  • 40–60% faster than Lite
  • Access to Vision Analyze, Web Search, Web Reader, and Zread MCP

Max ($80/month): 4× the Pro plan's usage

  • For high-volume workloads and advanced developers
  • All Pro plan benefits
  • Supports the latest flagship GLM-5
  • Guaranteed peak-hour performance
  • Early access to new features

Quarterly billing saves 10%. Yearly billing saves 30%. Each prompt typically allows 15–20 model calls, giving a total monthly allowance of tens of billions of tokens — all at ~1% of standard API pricing.

Supported Coding Tools
The GLM Coding Plan works with all major AI coding tools
Claude Code, Roo Code, Cline, OpenCode, OpenClaw, Kilo Code, Crush, Goose

Default model mapping (Claude Code):

  • ANTHROPIC_DEFAULT_OPUS_MODEL: GLM-4.7
  • ANTHROPIC_DEFAULT_SONNET_MODEL: GLM-4.7
  • ANTHROPIC_DEFAULT_HAIKU_MODEL: GLM-4.5-Air

Modify ~/.claude/settings.json to switch to GLM-5 or other models; GLM-5 access requires the Max plan.
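
A minimal sketch of that edit, assuming Claude Code's standard "env" block in settings.json. The GLM-5 model identifier shown ("glm-5") is an assumption; check Z.AI's GLM Coding Plan documentation for the current values.

```python
import json
from pathlib import Path

settings_path = Path.home() / ".claude" / "settings.json"
settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}

# Point Claude Code's default model slots at GLM models.
settings.setdefault("env", {}).update({
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-5",      # assumed identifier
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-5",    # assumed identifier
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "glm-4.5-air",
})

settings_path.write_text(json.dumps(settings, indent=2))
print(f"Updated {settings_path}")
```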

Key Advantages
  • Speed: Over 55 tokens per second for real-time interaction
  • No restrictions: No network barriers or account bans — just smooth, uninterrupted coding
  • Expanded capabilities: All plans support Vision Understanding, Web Search MCP, and Web Reader MCP
  • Data privacy: All Z.AI services are based in Singapore. No user content is stored — text prompts, images, and input data are never retained

Industry Implications

GLM-5's release carries significant implications across several dimensions:

Price Competition

GLM-5's open-weight pricing puts enormous pressure on OpenAI and Anthropic. At ~$0.11/M input tokens, it undercuts GPT-5.2 by 16x and Claude Opus 4.5 by 45x.
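
Those multiples follow directly from the per-million input rates; here is a quick, purely illustrative calculation for a hypothetical workload of 100M input tokens per month.

```python
# Published/estimated input rates in USD per million tokens (see pricing table above).
rates_per_million = {"GLM-5 (est.)": 0.11, "GPT-5.2": 1.75, "Claude Opus 4.5": 5.00}
monthly_input_tokens = 100_000_000  # hypothetical workload

for model, rate in rates_per_million.items():
    cost = monthly_input_tokens / 1_000_000 * rate
    print(f"{model}: ${cost:,.2f}/month")
# GLM-5 (est.): $11.00, GPT-5.2: $175.00, Claude Opus 4.5: $500.00
```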

Hardware Diversity

Proof that frontier AI training does not require NVIDIA hardware. This opens the door for other chip manufacturers and reduces single-vendor dependency risk.

Open Source Momentum

GLM-5 is now available under MIT license on HuggingFace — a frontier-scale MoE model that accelerates open-source AI research and enables smaller organizations to deploy competitive AI capabilities.

Geopolitical Shift

China demonstrating frontier AI capability on domestic hardware reshapes the global AI power balance and has implications for export control policy effectiveness.

Conclusion

GLM-5 is not just another model release — it is a statement about the decentralization of frontier AI capability. A 744-billion-parameter MoE model, trained entirely on domestic hardware, scoring #1 among open-source models on multiple benchmarks, and released under MIT license on HuggingFace. Whether or not GLM-5 is the "GPT-5 killer" its positioning suggests, it demonstrates that the era of frontier AI being exclusively a US-company capability is decisively over.

For developers and businesses evaluating LLM options, GLM-5 deserves serious attention — especially for cost-sensitive applications, agentic workflows, and organizations that prefer open-weight models they can host and fine-tune independently. For a comprehensive overview of how today's frontier models compare, explore our GPT vs Claude vs Gemini vs Grok comparison.


