ZCode is Z.ai’s free desktop agentic development environment for GLM-5.2 — a downloadable app that wraps the model in agent chat, a file manager, a terminal, a Git panel, and a live browser preview, then layers on Goal Mode, custom subagents, phone remote control, and chat-app bots. It officially launched the week of July 1, 2026, and it is the most complete answer yet to a simple question: what does a coding agent look like when the model vendor builds the whole cockpit?
The stakes are real. GLM-5.2 arrived in mid-June as the strongest open-weight coding model to date, but a model checkpoint is not a product. ZCode is Z.ai’s attempt to convert benchmark attention into daily seats — priced from $18 a month at list, with a 5-day free trial and a quota promotion that expires July 31, 2026.
This guide covers what ZCode actually is, a dated ship map of the launch-week releases, how Goal Mode and subagents work, remote control and bot channels, BYOK and MCP plumbing, the full pricing picture including what changes on August 1, and the honest limitations that early adopters are reporting.
- 01. ZCode is a free desktop ADE, not a CLI. An Electron app for macOS, Windows, and Linux (beta) bundling agent chat, file manager, terminal, Git panel, and live browser preview. Z.ai launched it the week of July 1, 2026 as the official harness for GLM-5.2.
- 02. Goal Mode auto-iterates to a stated objective. Set a goal with /goal and the agent works in rounds, automatically verifying whether the objective is met at the end of each iteration — no manual 'continue' prompts. It only summarizes and stops once verification passes.
- 03. Custom subagents get per-agent models and permissions. Beyond the built-in general-purpose and read-only Explore agents, v3.2.0 (Jun 29) added custom subagents — each with its own model, tool permissions, and system prompt, stored as Markdown in ~/.zcode/agents/.
- 04. Free for 5 days, then a plan or a key. The trial grants 5M tokens/day (GLM-5.2 3M + GLM-5-turbo 2M). After that, agent features need a GLM Coding Plan — $18/$72/$160 a month at list — or a pay-as-you-go API key. A ~1.5x quota promo runs through July 31, 2026.
- 05. Reception is genuinely mixed — and that matters. GLM-5.2 is near-frontier on many single-shot coding benchmarks at a fraction of the cost, but it trails Claude Opus 4.8 on sustained long-horizon agent work, and the ZCode harness itself is closed-source — a sticking point in developer forums.
01 — The Product
A full cockpit for GLM-5.2, not another CLI.
Z.ai calls ZCode an “Agentic Development Environment” — an ADE rather than an IDE. The distinction is deliberate. Where a Cursor or VS Code fork puts the editor first and bolts an agent onto it, ZCode puts the agent conversation at the center and arranges everything the agent touches around it: a file manager, a terminal, a Git panel, and a live browser preview, all in one free Electron app. The site’s own tagline reads “Simple, Fast, Vibe‑Ready | Official Harness for GLM-5.2.”
The launch framing needs one honest correction. Press coverage — including VentureBeat’s July 2 piece — reports that Z.ai officially launched ZCode on the Wednesday of that week, with the company announcing “Introducing ZCode, the official development environment for GLM-5.2” on X. But the public changelog shows the app iterating in public since at least June 26. The accurate framing is that ZCode launched the week of July 1, 2026 — not on a single hard date.
The model underneath is GLM-5.2, announced June 13 with open weights following under an MIT license on June 16 — a 753B-parameter Mixture-of-Experts model (per the Hugging Face model card), reportedly around 40B active per token, with a 1M-token context window. ZCode is how Z.ai wants you to consume it.
5-day out-of-box trial
First-time users get 5 days of free usage with no setup: 3M GLM-5.2 tokens plus 2M GLM-5-turbo tokens per day. Enough to run the agent against a real repository and form a judgment.
Released July 3, 2026
The public changelog shows eight point releases between June 26 and July 3 — near-daily iteration. macOS (Apple Silicon + Intel), Windows (64-bit + ARM64), and Linux (x64 + ARM64, labeled Beta) all track the same version.
GLM-5.2 MoE under MIT
Open weights on Hugging Face, 1M-token context, and API list pricing well below closed-frontier rates. ZCode is the official harness, but the model itself remains usable anywhere.
Why is Z.ai giving the harness away? VentureBeat reports the company’s on-premises deployment revenue reached RMB 534 million in FY2025 — more than 100% year-over-year growth and roughly three-quarters of total revenue — and frames ZCode plus the GLM Coding Plan as the vehicle for extending that base globally. The same report, citing South China Morning Post, notes Zhipu’s market capitalization crossed HK$1 trillion (about US$128 billion) on June 22 after GLM-5.2’s release. None of those figures are ours to verify — but they explain the ambition: ZCode is a distribution play, and the subscription is the product.
02 — Ship Map
What shipped when: the launch-week changelog, dated.
Most coverage says “ZCode launched this week” and stops there. The public changelog tells a more useful story: the headline features were already live before the press cycle, and the launch week itself was mostly hardening. The table below maps every public release from June 26 through July 3 to what it actually shipped — worth scanning before you assume a feature is mature.
| Version | Date | What shipped | Type |
|---|---|---|---|
| Before the press coverage · Jun 26–30 | | | |
| v3.1.6–v3.1.8 | Jun 26 | Feedback-verification skill; plan-mode exit fix — proof the app was iterating well before launch headlines | Point fixes |
| v3.2.0 | Jun 29 | Custom subagents with per-agent permissions and models; plugin management introduced (Beta) | Feature (Beta) |
| v3.2.1 | Jun 30 | Remote-workspace session-restore fix; Mermaid-diagram crash fix; Knowledge Base generation fixes | Fixes |
| Launch week · Jul 1–3 | | | |
| v3.2.2 | Jul 1 | Plugin page gains update/uninstall for built-in plugins; file rewind adds a “safety summary” against accidental actions | Feature |
| v3.2.3 | Jul 3 | MCP servers trusted by default in workspaces; new $zcode-configuration-guide skill; Anthropic API-key auth fix; Linux auto-update fix | Feature + fixes |
| v3.2.4 | Jul 3 | Skills restored to the slash menu; improved request-retry handling | Fixes |
Two readings follow. First, the velocity is genuinely impressive — eight public releases in eight days, including two on July 3 alone. Second, several marquee capabilities are explicitly young: custom subagents and plugin management both carry Beta labels and shipped days before the press cycle. Goal Mode, Remote Control, and the Bot Channel were already documented by launch week, but treat the newest surfaces as v1 software, because that is what the changelog says they are.
03 — Goal Mode
Set an objective, and the agent verifies itself each round.
Before Goal Mode, know the baseline: ZCode’s agent runs in one of five execution modes, cycled with Shift+Tab, that trade autonomy against confirmation friction.
Default Mode
The standard balance of forward progress and check-ins. The right starting point while you calibrate how much rope the agent deserves on your codebase.
Confirm Before Changes
Nothing touches disk or shell without an explicit yes. The mode for critical or production-adjacent work where a stray edit is expensive.
Auto Edit
File modifications proceed automatically while shell commands still ask first — a pragmatic middle ground for trusted repositories.
Plan Mode
The agent drafts a plan and waits for confirmation before implementing. Built for complex multi-step tasks where you want to see the shape of the work first.
Full Access
Minimal confirmation friction for clear, lower-risk tasks. Combine with Goal Mode when you want the agent to run a well-scoped objective to completion.
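One way to keep the modes straight is to write down only what the descriptions above actually promise. The sketch below does that in Python. `Mode`, `Action`, and `needs_confirmation` are hypothetical names for illustration, not ZCode APIs, and any mode/action pair the descriptions leave open returns `None` instead of a guess.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    DEFAULT = "Default Mode"
    CONFIRM_BEFORE_CHANGES = "Confirm Before Changes"
    AUTO_EDIT = "Auto Edit"
    PLAN = "Plan Mode"
    FULL_ACCESS = "Full Access"


class Action(Enum):
    EDIT_FILE = "edit_file"
    RUN_SHELL = "run_shell"


# Only behavior stated outright in the mode descriptions is encoded here.
# Default, Plan, and Full Access are left out because their per-action
# behavior is not spelled out.
_DOCUMENTED = {
    (Mode.CONFIRM_BEFORE_CHANGES, Action.EDIT_FILE): True,
    (Mode.CONFIRM_BEFORE_CHANGES, Action.RUN_SHELL): True,
    (Mode.AUTO_EDIT, Action.EDIT_FILE): False,
    (Mode.AUTO_EDIT, Action.RUN_SHELL): True,
}


def needs_confirmation(mode: Mode, action: Action) -> Optional[bool]:
    """Return True/False where the descriptions are explicit, else None."""
    return _DOCUMENTED.get((mode, action))


print(needs_confirmation(Mode.AUTO_EDIT, Action.RUN_SHELL))
print(needs_confirmation(Mode.DEFAULT, Action.EDIT_FILE))
```

The `None` results are the useful part: they mark exactly where you should test the behavior on a scratch repository before trusting a mode on real code.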
Goal Mode sits on top of those modes and changes the loop itself. You declare an objective with /goal <objective> — and can steer mid-flight with /goal replace, /goal pause, /goal resume, and /goal clear. From there the agent works in rounds: at the end of each iteration it performs automatic goal verification, and if the objective isn’t met, it starts another round without waiting for a manual “continue.” The task only wraps up and summarizes once verification passes. A summary panel tracks goal status, elapsed time, total tokens consumed, and iteration count while it runs.
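The round-and-verify loop described above can be sketched in a few lines. This is a simplified model, not ZCode's implementation: `run_round` and `verify` are hypothetical callables standing in for the agent's work and its goal check, and `max_rounds` is an assumed safety cap rather than a documented setting.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoalSummary:
    """Mirrors the fields the summary panel is described as tracking."""
    objective: str
    iterations: int = 0
    total_tokens: int = 0
    met: bool = False


def run_goal(
    objective: str,
    run_round: Callable[[str], int],  # hypothetical: one round of work, returns tokens used
    verify: Callable[[str], bool],    # hypothetical: True once the objective is met
    max_rounds: int = 10,             # assumed cap, not a documented ZCode setting
) -> GoalSummary:
    summary = GoalSummary(objective)
    while summary.iterations < max_rounds:
        summary.total_tokens += run_round(objective)
        summary.iterations += 1
        # Verification happens at the end of every round; no manual "continue".
        if verify(objective):
            summary.met = True
            break
    return summary


# Simulated run: each round fixes one of three compile errors.
state = {"errors": 3}


def fake_round(_objective: str) -> int:
    state["errors"] -= 1
    return 1_000


def fake_verify(_objective: str) -> bool:
    return state["errors"] == 0


result = run_goal("Fix all TypeScript compile errors", fake_round, fake_verify)
print(result)
```

In this simulation the loop stops after the third round, when the check first passes. The cap matters in practice: a goal with no reachable end state would otherwise consume quota indefinitely.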
The official docs pitch it for objectives with checkable end states — their own examples are “Refactor the whole module and keep the tests passing,” “Fix all TypeScript compile errors,” and “Get this page’s Lighthouse performance score above 90.” One caveat worth stating plainly: the docs do not document how verification works internally, so treat it as the agent checking its own work rather than an independent referee — and write goals whose success criteria a machine can actually check.
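To make "a machine can actually check" concrete: a criterion such as "the tests pass" reduces to a command's exit code. The sketch below is our own illustration of that idea, not a description of how ZCode verifies goals (the docs do not say). `command_succeeds` is a hypothetical helper, and the stand-in commands exist only so the example runs anywhere.

```python
import subprocess
import sys
from typing import Sequence


def command_succeeds(cmd: Sequence[str], timeout_s: float = 600.0) -> bool:
    """Treat exit code 0 as 'criterion met'; a timeout or missing binary counts as not met."""
    try:
        completed = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False
    return completed.returncode == 0


# Stand-ins for a real test runner or type checker invocation.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(command_succeeds(passing), command_succeeds(failing))
```

A goal phrased so that it maps onto a check like this ("the test command exits 0") gives each round something unambiguous to verify, while "make the code cleaner" does not.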
"the clearer the success criteria, the more accurate each round's verification"— ZCode documentation, Goal Mode guide
04 — Subagents
Two built-ins, plus custom agents with their own models.
Subagents are ZCode’s delegation primitive: the primary agent can spin off scoped workers, several of which run in parallel. Two are built in, and since v3.2.0 (June 29) you can define your own — the changelog describes it as adding “a generic sub-intelligent agent, supporting custom read and write permissions and models.”
general-purpose
The do-anything worker: implementation, file organization, verification passes, and parallel workstreams. Cannot be edited, deleted, or renamed.
Explore
Code search, call-chain mapping, architecture discovery, and pre-change risk research. It cannot create, modify, move, or delete files — a hard guarantee, not a convention.
Custom subagents
Configure a name, a color, a model (inherit the default or pin a specific one), a description the primary agent uses to auto-select it, per-tool read/write permissions, and a system prompt.
Custom subagents are stored as plain Markdown files at ~/.zcode/agents/ and invoked two ways: automatically, when the primary agent matches a task to a subagent’s description, or explicitly with @name in chat. The per-agent model field is the quietly powerful part — you can pin a cheaper model to a read-only research agent and reserve your GLM-5.2 quota for the agent that writes code.
Know the current limits before building a workflow around them: subagents are user-level only (no per-project definitions yet), the two built-ins are immutable and their names reserved, and execution is foreground-only — parallel subagents run concurrently, but the primary task blocks until they all finish. No background delegation yet, per the subagents documentation.
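The foreground-only constraint is easier to reason about with a small model. In the sketch below, subagents run concurrently but the caller does not get control back until every one has returned. The `Subagent` dataclass, the `delegate` function, and the `repo-researcher` agent with its pinned model name are all hypothetical. Only the two built-in names come from the docs.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Subagent:
    name: str
    model: str                 # "inherit" or a pinned model id (illustrative values)
    can_write: bool
    run: Callable[[str], str]  # hypothetical worker function


def delegate(agents: List[Subagent], task: str) -> Dict[str, str]:
    """Run subagents in parallel; return only after all of them finish."""
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {agent.name: pool.submit(agent.run, task) for agent in agents}
        # .result() blocks, so the primary task waits here (foreground-only).
        return {name: future.result() for name, future in futures.items()}


explore = Subagent("Explore", "inherit", False, lambda t: f"call-chain map for: {t}")
researcher = Subagent("repo-researcher", "a-cheaper-model", False, lambda t: f"risk notes for: {t}")
writer = Subagent("general-purpose", "inherit", True, lambda t: f"patch for: {t}")

results = delegate([explore, researcher, writer], "rename the config loader")
for name, output in results.items():
    print(f"{name}: {output}")
```

The practical consequence is that the slowest subagent sets the wall-clock time of the whole delegation, so it pays to keep delegated tasks similarly sized.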
05 — Remote & Bots
Steer a running agent from your phone.
Two features extend the agent beyond the desktop. Remote Control is the simpler one: click the phone icon in ZCode’s lower-left sidebar to generate a QR code and link, scan it, and your phone pairs to the currently open desktop workspace. The desktop remains the actual runtime — code changes, commands, and the project environment stay on your machine — while the phone checks status, sends instructions, approves, resumes, or stops runs, and can forward a screenshot into the session. The docs state the constraint plainly: “only one phone page can be connected at a time, and the phone can only access workspaces that are already open on the desktop.”
The Bot Channel goes further, pairing ZCode with a messaging app so you can drive it conversationally. The Feishu flow: choose Feishu in ZCode, scan a QR code, let ZCode auto-create a Feishu app and generate a pairing code, then send /bind <pairing-code> in the conversation. From the bot you can check status, create tasks, switch project or model, change run mode, and adjust reply detail.
06 — BYOK & MCP
Bring your own keys, import your MCP servers.
ZCode is opinionated about GLM-5.2 but not locked to it. Beyond the Z.ai/BigModel plan and API-key paths, the configuration docs give exact setup steps for five named third-party providers, plus a generic “Add Provider” path for any OpenAI- or Anthropic-protocol-compatible endpoint. One independent review notes the connectable-model list explicitly names Kimi K2.5 and DeepSeek alongside Claude and GPT.
| Connection path | Documented endpoint | What it's for |
|---|---|---|
| GLM Coding Plan (Z.ai / BigModel) | /api/coding/paas/v4 — coding-only | Subscription quota — the default path. Must route through the coding-only endpoint; see the callout below. |
| Z.ai / BigModel API key | /api/paas/v4 — general pay-as-you-go | Metered usage without a subscription. Not interchangeable with the coding-only endpoint. |
| Anthropic | api.anthropic.com | Run Claude models inside ZCode with your own Anthropic key. |
| OpenRouter | openrouter.ai/api | Aggregated multi-model access through one key. |
| Moonshot (Kimi) | api.moonshot.cn/anthropic | Kimi models via Moonshot's Anthropic-compatible endpoint. |
| OpenAI | api.openai.com | GPT models with your own OpenAI key. |
| MiniMax | api.minimaxi.com/anthropic | MiniMax models via its Anthropic-compatible endpoint. |
| Add Provider (custom) | any compatible endpoint | Self-hosted gateways or any OpenAI/Anthropic-protocol-compatible service. |
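The two Z.ai rows in the table differ only by path, which makes them easy to mix up in a config file. A small guard like the following can catch the mismatch before it costs you quota. The paths are the documented ones. The `credential_kind` labels, the helper names, and the `example.invalid` host are placeholders of ours, since the table does not give a hostname.

```python
from urllib.parse import urlparse

CODING_PLAN_PATH = "/api/coding/paas/v4"  # coding-only, subscription quota
PAYG_PATH = "/api/paas/v4"                # general pay-as-you-go

_EXPECTED = {"coding_plan": CODING_PLAN_PATH, "api_key": PAYG_PATH}


def expected_path(credential_kind: str) -> str:
    """Map a credential kind (hypothetical labels) to its documented path."""
    if credential_kind not in _EXPECTED:
        raise ValueError(f"unknown credential kind: {credential_kind!r}")
    return _EXPECTED[credential_kind]


def base_url_matches(credential_kind: str, base_url: str) -> bool:
    """True if the configured base URL ends at the path this credential needs."""
    path = urlparse(base_url).path.rstrip("/")
    return path == expected_path(credential_kind)


# Placeholder host; substitute whatever base URL your setup actually uses.
print(base_url_matches("coding_plan", "https://example.invalid/api/coding/paas/v4"))
print(base_url_matches("coding_plan", "https://example.invalid/api/paas/v4"))
```

The comparison is an exact match on the path rather than a suffix check, so a stray trailing slash is tolerated but a wrong endpoint is not.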
GLM Coding Plan subscriptions must route through the coding-only endpoint (/api/coding/paas/v4), which is explicitly not interchangeable with the general pay-as-you-go endpoint (/api/paas/v4). Point a plan at the wrong one and quota billing silently breaks — no error, just a plan you’re paying for and not drawing down.

MCP support is equally practical. Servers over stdio, HTTP, or SSE are managed under Settings → MCP Servers, split into “Configured” (user-added) and “Plugin” (bundled with an installed plugin) groups, and you can paste a full JSON config in either the bare server-name shape or the mcpServers wrapper. Better: ZCode one-click imports existing MCP configs from Claude Code (~/.claude/settings.json), Codex CLI (~/.codex/config.toml), OpenCode, and a generic ~/.agents/mcp.json — leaving the originals untouched. Z.ai also recommends three first-party servers (zai-mcp-server for visual understanding, web-search-prime, and web-reader), which require a Zhipu API token.
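The two accepted JSON shapes are worth seeing side by side. The sketch below normalizes both into the same mapping of server name to settings. It is an illustration of the shapes only: `normalize_mcp_config` is a hypothetical helper, and the per-server fields (`command`, `args`) follow the convention common to MCP client configs rather than a schema confirmed by ZCode's docs.

```python
import json
from typing import Any, Dict


def normalize_mcp_config(text: str) -> Dict[str, Dict[str, Any]]:
    """Accept {"mcpServers": {name: cfg}} or the bare {name: cfg} shape."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("MCP config must be a JSON object")
    servers = data["mcpServers"] if "mcpServers" in data else data
    if not isinstance(servers, dict):
        raise ValueError("'mcpServers' must be a JSON object")
    for name, cfg in servers.items():
        if not isinstance(cfg, dict):
            raise ValueError(f"server {name!r} must map to a JSON object")
    return servers


# Hypothetical server entry; field names are assumptions for illustration.
wrapped = '{"mcpServers": {"docs-search": {"command": "npx", "args": ["-y", "some-mcp-server"]}}}'
bare = '{"docs-search": {"command": "npx", "args": ["-y", "some-mcp-server"]}}'

print(normalize_mcp_config(wrapped) == normalize_mcp_config(bare))
```

Normalizing first means any later validation or merging code has to handle only one shape.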
Project instructions follow the emerging convention: ZCode reads a global ~/.zcode/AGENTS.md plus a workspace-root AGENTS.md, appending global then workspace. Note for Claude Code users migrating a repo: CLAUDE.md is read once, during onboarding, as a migration source into AGENTS.md — it is not continuously read at runtime, so updates to it won’t reach ZCode’s agent.
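The precedence rule can be modeled as a short function: global file first, workspace file second, CLAUDE.md ignored at runtime. This is a sketch of the described order only. `compose_instructions` is a hypothetical helper, and the blank-line separator between the two files is our assumption.

```python
import tempfile
from pathlib import Path
from typing import Optional


def compose_instructions(workspace_root: Path, home: Optional[Path] = None) -> str:
    """Concatenate global then workspace AGENTS.md; missing files are skipped."""
    home = home if home is not None else Path.home()
    candidates = [
        home / ".zcode" / "AGENTS.md",  # global instructions come first
        workspace_root / "AGENTS.md",   # workspace instructions are appended after
    ]
    parts = [p.read_text(encoding="utf-8").strip() for p in candidates if p.is_file()]
    return "\n\n".join(parts)  # separator is an assumption


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    fake_home, repo = Path(tmp) / "home", Path(tmp) / "repo"
    (fake_home / ".zcode").mkdir(parents=True)
    repo.mkdir()
    (fake_home / ".zcode" / "AGENTS.md").write_text("Prefer small commits.", encoding="utf-8")
    (repo / "AGENTS.md").write_text("Run the linter before finishing.", encoding="utf-8")
    (repo / "CLAUDE.md").write_text("Not read at runtime.", encoding="utf-8")
    demo = compose_instructions(repo, fake_home)
    print(demo)
```

Because workspace text comes last, a repo-level rule reads as the more specific instruction, which is worth remembering when the two files disagree.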
07 — Pricing & Quotas
The trial, the plans, and the July 31 quota cliff.
The pricing structure has three layers, and conflating them is the most common mistake in early coverage. Layer one: the app is free, and first-time users get a 5-day trial with 5M tokens a day (3M of GLM-5.2, 2M of GLM-5-turbo), ready out of the box. Layer two: after the trial, agent features require either a GLM Coding Plan — Lite at $18, Pro at $72 (5x Lite usage), Max at $160 (20x Lite usage) a month at list — or a pay-as-you-go API key. The zcode.z.ai site currently displays 10%-discounted introductory pricing ($16.20 / $64.80 / $144), with a note that final details live on z.ai. Layer three: a ZCode-specific quota promotion, separate from any price discount, runs through July 31, 2026.
That promo is worth understanding precisely, because it expires with a date attached. Per the vendor’s published terms, Coding Plan subscribers get roughly 1.5x effective GLM-5.2 quota inside ZCode: during peak hours (14:00–18:00 daily), usage is metered at 2x instead of the standard 3x, and during the other 20 hours it is metered at 0.67x instead of 1x. On August 1 those coefficients revert. Using Z.ai’s approximate per-tier base allowances (~80 / ~400 / ~1,600 prompts per rolling 5-hour window at 1x metering), here is what that actually means:
| Tier (list) | Off-peak, promo (0.67x) | Peak, promo (2x) | Off-peak, Aug 1+ (1x) | Peak, Aug 1+ (3x) |
|---|---|---|---|---|
| Lite — $18/mo · base ~80 | ~119 | ~40 | ~80 | ~27 |
| Pro — $72/mo · base ~400 | ~597 | ~200 | ~400 | ~133 |
| Max — $160/mo · base ~1,600 | ~2,388 | ~800 | ~1,600 | ~533 |
Read the columns, not the headline. Cells are simply the tier’s approximate base allowance divided by the metering coefficient, so treat them as directional rather than contractual — the base figures are Z.ai’s own approximations and prompts fan out into many model calls. But the shape is unambiguous: an off-peak-heavy subscriber loses about a third of their effective allowance on August 1 (~119 → ~80 on Lite), while a peak-hours subscriber’s throughput drops by a third from the promo rate (~40 → ~27) and lands at roughly a third of what the same window yields off-peak after reversion (~27 vs ~80). If you subscribe during the promo, budget against the right-hand columns — that is the plan you will actually own in August.
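The table's arithmetic is simple enough to reproduce, which is useful if you want to plug in your own mix of peak and off-peak hours. The sketch below uses the base allowances and coefficients quoted above. The function names are ours, and the time zone of the 14:00–18:00 window is not stated in the terms we quote, so `is_peak_hour` takes a plain hour and leaves the zone to you.

```python
# Approximate prompts per rolling 5-hour window at 1x metering (Z.ai's own figures).
BASE_PROMPTS = {"Lite": 80, "Pro": 400, "Max": 1600}

# Metering coefficients: higher means each prompt costs more quota.
COEFFICIENTS = {
    "promo": {"off_peak": 0.67, "peak": 2.0},    # through July 31, 2026
    "standard": {"off_peak": 1.0, "peak": 3.0},  # from August 1
}


def effective_prompts(tier: str, period: str, window: str) -> int:
    """Base allowance divided by the metering coefficient, rounded."""
    return round(BASE_PROMPTS[tier] / COEFFICIENTS[period][window])


def is_peak_hour(hour: int) -> bool:
    """Peak is 14:00-18:00 daily; the time zone is an open question here."""
    return 14 <= hour < 18


for tier in BASE_PROMPTS:
    row = [
        effective_prompts(tier, period, window)
        for period in ("promo", "standard")
        for window in ("off_peak", "peak")
    ]
    print(tier, row)
```

For Lite this yields 119, 40, 80, and 27, matching the table (with the columns in promo-then-standard order).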
The pay-as-you-go alternative prices GLM-5.2 at $1.40 per million input tokens ($0.26 cached) and $4.40 per million output tokens on Z.ai’s published rates. VentureBeat frames that as a cost reduction of up to 82% versus Claude Opus 4.8’s $5 and $25 — vendor list rates as that outlet cites them:
[Chart: Published API list pricing · GLM-5.2 vs Claude Opus 4.8. Source: Z.ai published API rates; Opus 4.8 rates as cited by VentureBeat, Jul 2, 2026]

Demand appears real — VentureBeat quotes one X user (translated from Chinese) needling Z.ai: “Bro, can’t snag your family’s Coding Plan? When are you gonna stock up on more cards?” Whether the plan pencils out for your workload is a longer conversation — we run the full per-dollar math in our GLM Coding Plan value analysis.
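The "up to 82%" figure for pay-as-you-go pricing can be checked directly from the list rates quoted earlier in this section: it is the output-token reduction, while the input-token reduction is smaller. The sketch below computes both and prices one workload. The 10M-input / 2M-output workload is a made-up example, and cached-input pricing is left out because only the GLM-5.2 cached rate is given.

```python
# USD per million tokens, as quoted in this section.
GLM_52 = {"input": 1.40, "cached_input": 0.26, "output": 4.40}
OPUS_48 = {"input": 5.00, "output": 25.00}


def reduction_pct(cheaper: float, pricier: float) -> float:
    """Percentage saved by paying `cheaper` instead of `pricier`."""
    return (1 - cheaper / pricier) * 100


def cost_usd(rates: dict, input_tokens: int, output_tokens: int) -> float:
    """List-price cost of a workload, ignoring cache hits."""
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000


print(f"input reduction:  {reduction_pct(GLM_52['input'], OPUS_48['input']):.1f}%")
print(f"output reduction: {reduction_pct(GLM_52['output'], OPUS_48['output']):.1f}%")

# Hypothetical workload: 10M input tokens, 2M output tokens.
glm_cost = cost_usd(GLM_52, 10_000_000, 2_000_000)
opus_cost = cost_usd(OPUS_48, 10_000_000, 2_000_000)
print(f"GLM-5.2: ${glm_cost:.2f}  Opus 4.8: ${opus_cost:.2f}")
```

The blended saving for any real workload falls between the input and output figures, and depends on how input-heavy your agent sessions are.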
If the 5-day trial convinces you, subscribing to the GLM Coding Plan through our link supports our testing work. Referral link: we earn Z.ai platform credits if you subscribe, and new Z.ai accounts get 10% off their first subscription order. That discount applies only to a new account’s first order, isn’t stackable with other offers, and requires completing payment within 72 hours of clicking through.
08 — Honest Limitations
What early adopters actually say.
The Hacker News reception was substantive and genuinely mixed — which is more useful than either hype or dismissal. The praise clusters around economics and capability-per-dollar: GLM-5.2 lands near-frontier on many single-shot coding benchmarks at a fraction of the cost of closed alternatives. The criticism clusters around three things worth weighing before you commit a team.
The harness is closed-source. The GLM-5.2 weights are MIT-licensed and public, but ZCode itself is not — and for an app with file, terminal, and Git access, that asymmetry bothered many commenters. One argued a closed-source agent system from a Chinese vendor is “essentially a black box with full user permissions,” pointing to Chinese national-security laws that obligate companies to cooperate with state intelligence efforts. Whatever your own threat model, enterprises with strict data-residency requirements will need answers ZCode’s docs don’t currently give.
Speed is uneven, especially at peak. One commenter in the launch thread reported the agent “Stalled for 40 minutes on trivial tool calls like find, two times” while allowing that “It shows potential, answer/code quality was solid.” The 3x-metered peak window in the pricing section is Z.ai’s own acknowledgment that capacity is contended between 14:00 and 18:00.
Rough international edges. A second HN thread was dominated by confusion over the Linux Beta signup flow routing through Feishu’s Chinese-only download page with no visible language toggle — a small thing, but read by several commenters as a sign of how young Z.ai’s international readiness is.
"GLM 5.2 is in an uncanny valley where it's too big to run at home, too expensive and slow in comparison to similarly capable model... it doesn't have a solid place in this marketplace right now." — Hacker News commenter, ZCode launch thread
On capability, the honest benchmark framing matters more than any single number. GLM-5.2 is near-frontier on many single-shot coding benchmarks at a fraction of the cost — and it trails Claude Opus 4.8 on sustained long-horizon agent work, the multi-hour autonomous sessions where frontier models earn their premium. VentureBeat’s claim that GLM-5.2 sits within a point of Opus 4.8 on one headline benchmark is that outlet’s characterization; the fuller per-benchmark picture is more mixed. We break it down eval-by-eval in our GLM-5.2 vs Claude Opus benchmark analysis.
Step back and the trend is bigger than this product: the frontier labs have concluded that the harness, not just the model, is the durable surface. Anthropic ships Claude Code, OpenAI ships Codex, and now Z.ai ships ZCode — each vendor betting that owning the environment where developers spend their day is worth more than winning any single benchmark cycle. ZCode is the first of these to lead with a full desktop GUI rather than a terminal, which tells you whom Z.ai is really courting: the much larger population of developers who never fell in love with a CLI.
09 — Verdict
Who should use ZCode today.
The decision is cleaner than the feature list suggests. Four profiles, four answers:
The trial costs you nothing
Five days and 5M tokens a day against your own repository is a real evaluation, not a demo. If the agent holds up, Lite at $18 list is the cheapest serious coding-agent subscription on the market — just budget against the post-July-31 quota columns, not the promo.
Frontier still earns its premium
For sustained multi-hour autonomous sessions — large refactors, migration campaigns, agentic pipelines that run unattended — Claude Opus 4.8 remains ahead, and the closed-frontier harnesses are more battle-tested. Cheap tokens don't help if the agent loses the plot at hour three.
A free BYOK cockpit
Even if you never subscribe, ZCode's BYOK support (Anthropic, OpenAI, OpenRouter, Moonshot, MiniMax, plus custom endpoints) and one-click MCP import make it a credible free harness for models you already pay for elsewhere.
The questions aren't answered yet
A closed-source harness with full workspace permissions, unresolved data-residency questions, and no presence in Gartner's 2026 coding-agent evaluation is a hard sell to a security review today. Watch it — the velocity is real — but don't lead with it.
If you sit between profiles, sequence it: run the trial, keep your existing harness for the work that pays, and route by task — the same discipline we recommend in our model-routing guide and apply when choosing between Claude tiers. For a head-to-head on the two harnesses specifically, see ZCode vs Claude Code. And if your team wants help building an evaluation that reflects your actual codebase rather than benchmark folklore, that is exactly what our AI transformation engagements start with.
10 — Conclusion
The harness war just gained a serious third player.
ZCode makes GLM-5.2 feel like a product, not a checkpoint.
Strip the launch noise and ZCode is a coherent bet: take the most credible open-weight coding model, wrap it in a free desktop environment that a non-CLI developer can actually love, add real differentiators — Goal Mode’s automatic verification loop, per-agent models on custom subagents, phone remote control — and price the subscription at a fraction of the closed-frontier alternatives. The eight-releases-in-eight-days changelog says the team behind it is shipping, not just announcing.
The honest counterweight is equally clear. The harness is closed-source, the trust questions are unresolved for regulated teams, peak-hour speed is contended enough that Z.ai meters it at 3x, and for sustained long-horizon agent work Claude Opus 4.8 remains the stronger tool. Near-frontier at a fraction of the cost is a real value proposition — it is not the same proposition as frontier.
Looking forward, watch three markers and one pattern. July 31: the quota promo expires, and the right-hand columns of our cliff table become the real product. The “later versions” promise: DingTalk, Discord, and WeCom bot channels will signal whether Z.ai’s international ambition is roadmap or rhetoric. The plugin marketplace: whether third parties actually build for it decides if ZCode becomes a platform. And the pattern — model vendors shipping their own harnesses — is now firmly established. Expect the next competitive cycle to be fought over ergonomics, quotas, and trust as much as over benchmark tables.