Claude Code Leak: Agentic Architecture Lessons 2026
On March 31, 2026, a misconfigured npm package exposed Claude Code's entire 512,000-line TypeScript codebase. The leak revealed the agentic harness architecture that gives Claude Code its capabilities — and 44 unreleased features Anthropic kept behind compile-time feature flags.
512,000 lines of TypeScript · 1,900 source files exposed · 44 hidden feature flags · 41,500 GitHub forks within hours
What Happened: The npm Source Map Leak
On March 31, 2026, security researcher Chaofan Shou discovered that Anthropic had shipped version 2.1.88 of the @anthropic-ai/claude-code npm package with a 59.8 MB source map file attached. Source maps are debugging files that connect bundled JavaScript back to the original source code. Including one in a public npm package is the equivalent of publishing your entire private repository.
Within hours, snapshots of the code were mirrored to GitHub repositories that accumulated over 41,500 forks before Anthropic could respond. The company confirmed this was a "release packaging issue caused by human error, not a security breach" — a single misconfigured .npmignore or files field in package.json was the root cause.
What made this leak particularly consequential was not just the volume of code — 512,000 lines across 1,900 TypeScript files — but what it revealed about how modern AI coding assistants actually work. The source code exposed the full agentic harness: the orchestration layer that transforms a language model into a capable software engineering tool.
What leaked:
- 512,000 lines of TypeScript source code
- 44 compile-time feature flags
- 26 hidden slash commands
- 120+ undocumented environment variables
How it happened:
- Missing .npmignore configuration
- 59.8 MB source map included in package
- Human error in release pipeline
- No pre-publish content validation
The Agentic Harness Architecture
The most significant revelation from the leak was the architecture of what Anthropic calls the agentic harness — the software layer that wraps the language model and tells it how to use tools, enforce safety guardrails, and orchestrate complex tasks. This is the layer that transforms a general-purpose LLM into a capable coding assistant.
For anyone building AI-powered tools and agents, understanding this architecture is invaluable. The harness is what makes Claude Code Claude Code — not just the underlying model.
Core Architectural Components
The leaked codebase revealed a three-layer architecture. At the center sits QueryEngine.ts, a massive 46,000-line file handling all LLM API calls, streaming, caching, and orchestration. Around it, a base tool definition layer spans approximately 29,000 lines, encompassing schema validation, permission enforcement, and error handling. The outer layer consists of approximately 40 tools in a modular plugin architecture that the harness registers and manages.
Query Engine
46,000-line orchestration core handling LLM calls, streaming, caching, and the tool execution loop
Tool Framework
29,000-line base definition layer with schema validation, permission enforcement, and error handling
Plugin System
~40 modular tools in a plugin architecture with hooks and MCP server integration
The architecture also uses React and Ink for terminal rendering, employing game-engine techniques to deliver the responsive, real-time interface that developers have come to expect. This is a far cry from the simple request-response patterns most people imagine when thinking about AI tool usage — it is a sophisticated application framework.
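As a concrete illustration of the tool framework layer, here is a minimal sketch of a plugin-style tool registry with schema validation and a declared permission level. The names (ToolDefinition, registerTool, read_file) and the stubbed implementation are hypothetical, chosen for the sketch rather than taken from the leaked source.

```typescript
// Hypothetical sketch of a plugin-style tool layer; all identifiers
// here are illustrative, not the names from the leaked codebase.

type PermissionLevel = "read" | "write" | "execute";

interface ToolDefinition<In, Out> {
  name: string;
  permission: PermissionLevel;
  // Schema validation: reject malformed input before the tool runs.
  validate(input: unknown): In;
  run(input: In): Promise<Out>;
}

const registry = new Map<string, ToolDefinition<any, any>>();

function registerTool<In, Out>(tool: ToolDefinition<In, Out>): void {
  if (registry.has(tool.name)) throw new Error(`duplicate tool: ${tool.name}`);
  registry.set(tool.name, tool);
}

// Example plugin: a read-only file reader with a stubbed implementation.
registerTool({
  name: "read_file",
  permission: "read",
  validate(input: unknown): { path: string } {
    const obj = input as { path?: unknown };
    if (typeof obj?.path !== "string") throw new Error("path must be a string");
    return { path: obj.path };
  },
  async run(input: { path: string }): Promise<{ contents: string }> {
    return { contents: `(contents of ${input.path})` }; // stub for the sketch
  },
});
```

The design point is that the base layer owns validation and permission metadata, so individual tools stay small and the harness can enforce policy uniformly.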
System Prompts and Tool Loop Design
One of the most instructive aspects of the leaked code is how Anthropic structures the system prompt that tells Claude how to behave as a coding assistant. Rather than a single monolithic prompt, Claude Code uses a modular system prompt with cache-aware boundaries designed to maximize prompt caching efficiency while keeping instructions current.
Modular Prompt Architecture
The system prompt is assembled from multiple segments, each serving a distinct purpose: base behavior instructions, tool-specific guidance, project context (from CLAUDE.md files), and session-specific state. Cache boundaries are placed between segments so that stable content (like base instructions) can be cached across requests while dynamic content (like current file context) is refreshed each turn.
This matters for cost optimization. Prompt caching can reduce token costs by up to 90% for repeated prefixes. By carefully structuring which parts of the prompt change between turns and which remain stable, Anthropic minimizes API costs while maintaining full contextual awareness.
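The Anthropic Messages API exposes this mechanism through per-block cache_control markers. The sketch below shows one way such cache-aware assembly could work: stable segments are ordered first and a cache breakpoint is placed after the last one. The segment structure and ordering rule are illustrative, not the leaked implementation.

```typescript
// Sketch: assembling a modular system prompt with a cache breakpoint.
// The `cache_control` field mirrors the Messages API's content-block
// shape; segment names and contents are invented for illustration.

interface PromptSegment {
  text: string;
  stable: boolean; // stable segments can be cached across turns
}

type ContentBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

function assembleSystemPrompt(segments: PromptSegment[]): ContentBlock[] {
  // Order stable segments first so the cacheable prefix is maximized.
  const ordered = [...segments].sort(
    (a, b) => Number(b.stable) - Number(a.stable),
  );
  const blocks: ContentBlock[] = ordered.map((s) => ({
    type: "text",
    text: s.text,
  }));
  // Mark the end of the stable prefix as the cache breakpoint.
  const lastStable = ordered.filter((s) => s.stable).length - 1;
  if (lastStable >= 0) blocks[lastStable].cache_control = { type: "ephemeral" };
  return blocks;
}
```

Everything before the breakpoint (base instructions, tool guidance) is reused from the cache on the next turn; everything after it (current file context) is re-sent fresh.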
The Tool Execution Loop
The tool loop is the core mechanism by which Claude Code executes multi-step tasks. When the model determines it needs to use a tool — reading a file, executing a command, searching code — it emits a tool call, the harness executes it, and the result is fed back into the conversation for the next reasoning step.
The leaked code reveals how Anthropic handles the critical challenges of tool loops:
- Permission enforcement — each tool call is validated against the user's permission configuration before execution, with different levels for file reads, writes, and command execution
- Error recovery — failed tool calls are caught and fed back to the model with structured error information so it can adjust its approach rather than crashing
- Timeout management — long-running operations are bounded by configurable timeouts to prevent the agent from getting stuck in infinite loops
- Output truncation — large tool outputs are intelligently truncated to fit within context limits while preserving the most relevant information
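A condensed sketch of such a loop, exercising all four safeguards, might look like the following. The model call is stubbed and every name, limit, and message format is illustrative; this shows the general pattern, not Anthropic's code.

```typescript
// Minimal tool-loop sketch: permission gate, error recovery, timeout,
// and output truncation. `callModel` stands in for a real LLM API call.

type ToolCall = { name: string; input: unknown };
type ModelTurn = { toolCall?: ToolCall; finalAnswer?: string };

interface LoopTool {
  permission: "read" | "write" | "execute";
  run(input: unknown): Promise<string>;
}

const MAX_OUTPUT_CHARS = 2_000; // example context budget per tool result
const TOOL_TIMEOUT_MS = 10_000; // example per-tool timeout

async function runToolLoop(
  callModel: (transcript: string[]) => Promise<ModelTurn>,
  tools: Record<string, LoopTool>,
  granted: Set<string>, // permission levels the user has granted
  maxTurns = 10,
): Promise<string> {
  const transcript: string[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = await callModel(transcript);
    if (step.finalAnswer !== undefined) return step.finalAnswer;
    if (!step.toolCall) return "error: model produced no answer or tool call";
    const call = step.toolCall;
    const tool = tools[call.name];
    let result: string;
    if (!tool) {
      result = `error: unknown tool ${call.name}`;
    } else if (!granted.has(tool.permission)) {
      // Permission enforcement: block the call, tell the model why.
      result = `error: permission '${tool.permission}' not granted`;
    } else {
      let timer: ReturnType<typeof setTimeout> | undefined;
      try {
        // Timeout management: bound long-running tool executions.
        result = await Promise.race([
          tool.run(call.input),
          new Promise<string>((_, rej) => {
            timer = setTimeout(() => rej(new Error("timeout")), TOOL_TIMEOUT_MS);
          }),
        ]);
      } catch (err) {
        // Error recovery: structured failure fed back, loop continues.
        result = `error: ${(err as Error).message}`;
      } finally {
        clearTimeout(timer);
      }
    }
    // Output truncation: keep tool results within the context budget.
    if (result.length > MAX_OUTPUT_CHARS) {
      result = result.slice(0, MAX_OUTPUT_CHARS) + "\n[truncated]";
    }
    transcript.push(`${call.name} -> ${result}`);
  }
  return "error: turn limit reached";
}
```

Note how every failure path produces a string the model can reason about on the next turn; the loop never throws outward.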
44 Feature Flags: What Anthropic Kept Hidden
Perhaps the most widely discussed aspect of the leak was the discovery of 44 compile-time feature flags pointing to unreleased functionality. These features are fully implemented in the source but stripped from external builds via Bun's compile-time dead code elimination. The code is there — 108 feature-gated modules in total — but it compiles to nothing in the version users install.
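The underlying pattern is straightforward to reproduce. In the sketch below, the flag identifier would be substituted at build time (for example via bun build --define), after which the disabled branch is unreachable and the bundler's dead code elimination drops it and everything only it references. The flag name and default are invented for illustration.

```typescript
// Sketch of the compile-time flag pattern. In a real build the flag is
// substituted by the bundler, e.g.:
//   bun build cli.ts --define FEATURE_KAIROS=false
// making the gated branch dead code that never reaches the published
// bundle. For this standalone sketch the flag simply defaults to false.

const FEATURE_KAIROS =
  (globalThis as Record<string, unknown>).FEATURE_KAIROS === true;

function startKairosDaemon(): string {
  // Feature-gated module body: absent from external builds after DCE.
  return "kairos daemon started";
}

function boot(): string {
  if (FEATURE_KAIROS) {
    return startKairosDaemon();
  }
  return "kairos disabled";
}
```

The practical consequence, as the leak demonstrated, is that the gated code is invisible in shipped bundles but fully present in the source tree.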
The unreleased capabilities reveal Anthropic's product roadmap in remarkable detail. Here are the most significant:
- KAIROS — transforms Claude Code into a persistent background agent that proactively monitors your project and takes autonomous action when appropriate.
- Cloud deep planning — offloads complex planning tasks to a cloud container running Opus 4.6 for up to 30 minutes of deep reasoning on architectural decisions.
- Multi-agent coordination — enables multiple Claude Code instances to coordinate on a shared project with task delegation, conflict resolution, and shared state management.
- Voice interaction — a complete voice interaction system with push-to-talk controls, allowing developers to direct Claude Code through spoken commands.
The most controversial discovery was undercover.ts — roughly 90 lines that inject a system prompt instructing Claude to never mention it is an AI and to strip all Co-Authored-By attribution when contributing to external repositories. While the exact use case is unclear (it may be for specific enterprise scenarios where AI disclosure is handled differently), the existence of such a mode raised questions about transparency in AI-assisted development.
KAIROS: The Autonomous Daemon Mode
KAIROS deserves its own section because it represents a fundamental shift in the AI coding assistant paradigm. Named after the Ancient Greek concept of "the right time" — the opportune moment for action — KAIROS would transform Claude Code from a tool you invoke into an assistant that runs alongside you, deciding autonomously when to act and when to stay silent.
How KAIROS Works (According to the Leaked Code)
The leaked source reveals a well-engineered daemon architecture:
- Periodic tick prompts — KAIROS receives regular tick prompts that provide current project state. At each tick, it evaluates whether proactive action is warranted or whether it should remain dormant.
- Append-only daily logs — all observations and actions are recorded in immutable daily log files, creating an audit trail of what the agent noticed and what it chose to do (or not do).
- GitHub webhook subscriptions — KAIROS can subscribe to repository events, allowing it to respond to pull requests, issue creation, CI failures, and other development lifecycle events.
- 15-second blocking budget — proactive actions are constrained to a 15-second blocking budget, ensuring the daemon never interrupts the developer's flow for more than a brief pause.
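Assuming that design, a tick handler might be structured roughly as follows. The state shape, log format, and decision rule are invented for the sketch; only the 15-second budget and the append-only log idea come from the leak reports.

```typescript
// Sketch of a tick loop: decide dormancy vs. action, bound blocking
// work by a budget, and append every decision to an immutable log.

const BLOCKING_BUDGET_MS = 15_000;

interface TickState {
  ciFailing: boolean;
  openReviewRequests: number;
}

type LogEntry = { at: string; decision: string };

const dailyLog: LogEntry[] = []; // stand-in for an append-only log file

function appendLog(decision: string): void {
  dailyLog.push({ at: new Date().toISOString(), decision });
}

async function onTick(
  state: TickState,
  act: () => Promise<string>,
): Promise<string> {
  // Decide whether proactive action is warranted at this tick.
  if (!state.ciFailing && state.openReviewRequests === 0) {
    appendLog("dormant");
    return "dormant";
  }
  let timer: ReturnType<typeof setTimeout> | undefined;
  try {
    // Bound the blocking action by the budget.
    const outcome = await Promise.race([
      act(),
      new Promise<string>((_, rej) => {
        timer = setTimeout(
          () => rej(new Error("budget exceeded")),
          BLOCKING_BUDGET_MS,
        );
      }),
    ]);
    appendLog(`acted: ${outcome}`);
    return outcome;
  } catch (err) {
    appendLog(`aborted: ${(err as Error).message}`);
    return "aborted";
  } finally {
    clearTimeout(timer);
  }
}
```

The append-only log is what makes the autonomy auditable: even a dormant tick leaves a record of the decision not to act.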
autoDream: Background Memory Consolidation
A companion feature called autoDream, found in the services/autoDream/ directory, runs as a forked sub-agent during idle periods. Its purpose is memory consolidation: merging observations from across sessions, removing logical contradictions, and converting tentative notes into confirmed facts. This is analogous to how human memory consolidation works during sleep — processing the day's experiences into durable knowledge.
Together, KAIROS and autoDream point toward a future where AI coding assistants are not just tools that respond to commands, but persistent collaborators that build understanding of your project over time.
Memory Architecture and Context Management
The leaked source reveals a sophisticated three-layer memory architecture that addresses one of the fundamental challenges of LLM-based agents: maintaining relevant context across interactions while operating within fixed context windows.
MEMORY.md Index (Always Loaded)
A lightweight pointer file (~150 characters per line) perpetually loaded into context. Acts as a table of contents pointing to detailed knowledge stored elsewhere.
Session Context (Active Conversation)
The active conversation history, tool call results, and working state within the current context window. Managed with intelligent eviction strategies.
Persistent Knowledge (Cross-Session)
File-based memory that persists across sessions, storing project-specific knowledge, learned patterns, and consolidated observations.
This architecture directly informs best practices for building AI agent memory systems. The key insight is that memory should be layered by access frequency and persistence: always-on pointers, session-scoped working memory, and durable cross-session knowledge.
The MEMORY.md approach is particularly elegant. Rather than trying to fit all project knowledge into the context window, the agent maintains a compact index of pointers and loads detailed knowledge only when relevant. This mirrors how human experts work — you don't keep every fact in working memory, but you know where to look.
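A minimal sketch of that layering, with invented file names and a deliberately naive relevance check, could look like this:

```typescript
// Sketch of the three-layer pattern: a compact always-loaded index whose
// entries point at detailed notes loaded only on demand. The file layout
// and entry format are illustrative, not the leaked implementation.

interface IndexEntry {
  topic: string;
  pointer: string; // where the detailed note lives
}

// Layer 1: always-loaded index, kept to roughly one short line per entry.
const memoryIndex: IndexEntry[] = [
  { topic: "build", pointer: "memory/build.md" },
  { topic: "testing", pointer: "memory/testing.md" },
];

// Layer 3: persistent knowledge, fetched only when a topic is relevant.
const knowledgeStore = new Map<string, string>([
  ["memory/build.md", "Build with `make all`; artifacts land in dist/."],
  ["memory/testing.md", "Run `make test`; integration tests need Docker."],
]);

function renderIndex(): string {
  return memoryIndex.map((e) => `- ${e.topic}: see ${e.pointer}`).join("\n");
}

function loadRelevant(query: string): string[] {
  // Naive keyword match for the sketch; a real agent would let the
  // model itself decide which pointers to follow.
  return memoryIndex
    .filter((e) => query.toLowerCase().includes(e.topic))
    .map((e) => knowledgeStore.get(e.pointer) ?? "");
}
```

Only the index costs context every turn; the detailed notes cost tokens only when a query actually touches their topic.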
Security Implications and Supply Chain Risks
Beyond the intellectual property concerns, the Claude Code leak exposed real security risks that affect anyone using npm-distributed AI tools.
The Trojanized Dependency Window
Users who installed or updated Claude Code via npm on March 31, 2026, between 00:21 and 03:29 UTC may have pulled a trojanized version of an HTTP client dependency. This dependency contained a cross-platform remote access trojan — a significant supply chain attack that exploited the confusion surrounding the leak.
Exposed Attack Surfaces
The leaked orchestration logic also creates indirect security risks. With full visibility into how Claude Code's hooks and MCP servers work, attackers could potentially design malicious repositories — containing crafted CLAUDE.md files or package configurations — intended to trick Claude Code into executing harmful commands when a developer opens the project.
Anthropic's response included issuing DMCA takedown requests to GitHub, which initially resulted in the takedown of thousands of repositories — including some that were unrelated to the leak. Anthropic later acknowledged this overbroad response was itself an error, further complicating the incident narrative.
npm Security Lessons
The incident highlights a systemic risk in the npm ecosystem. A single misconfigured field in package.json can expose an entire proprietary codebase. For teams publishing npm packages, the essential safeguards are:
- Always run npm pack --dry-run before publishing to verify package contents
- Use an explicit files allowlist in package.json rather than relying on .npmignore exclusions
- Implement CI checks that validate package size and file counts against expected baselines
- Never include source maps, test files, or internal documentation in published packages
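The CI-check idea can be sketched as a small validation over the file list that npm pack --dry-run --json reports. The forbidden patterns and the size baseline below are example values for this sketch, not a standard.

```typescript
// Sketch of a pre-publish CI check: given the list of files that would
// be packed, fail on forbidden paths or an unexpected total size.

interface PackedFile {
  path: string;
  size: number; // bytes
}

const FORBIDDEN_PATTERNS = [/\.map$/, /(^|\/)test\//, /(^|\/)internal\//];
const MAX_PACKAGE_BYTES = 5 * 1024 * 1024; // example 5 MB baseline

function validatePackContents(files: PackedFile[]): string[] {
  const problems: string[] = [];
  for (const f of files) {
    if (FORBIDDEN_PATTERNS.some((re) => re.test(f.path))) {
      problems.push(`forbidden file in package: ${f.path}`);
    }
  }
  const total = files.reduce((sum, f) => sum + f.size, 0);
  if (total > MAX_PACKAGE_BYTES) {
    problems.push(`package too large: ${total} bytes`);
  }
  return problems;
}
```

Under these example rules, a 59.8 MB .map file would trip both the pattern check and the size baseline, and the publish would fail before anything left the building.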
Lessons for AI Builders
Regardless of how one feels about the ethics of analyzing leaked code, the architectural patterns revealed in the Claude Code source provide a blueprint for building production-grade AI agents. Here are the key takeaways for development teams.
1. The Harness Is the Product
The orchestration layer — not the LLM — determines an AI agent's real-world capabilities. Invest as much in your harness architecture as you do in model selection.
2. Permission Gates Are Non-Negotiable
Every tool call should be validated against explicit permission policies. Claude Code's layered permission model (read / write / execute) is a proven pattern.
3. Layer Your Memory System
Separate always-loaded context (pointers) from session-scoped state from persistent knowledge. This maximizes context efficiency.
4. Feature Flags with Dead Code Elimination
Use compile-time feature flags so unreleased functionality never ships in production builds. Bun's tree-shaking makes this seamless for JavaScript/TypeScript projects.
5. Modular System Prompts
Structure system prompts with cache-aware boundaries. Stable instructions at the top, dynamic context at the bottom. This dramatically reduces API costs.
6. Graceful Error Recovery
Feed structured error information back to the model rather than crashing on tool failures. The agent should adapt its approach, not halt.
These patterns are not exclusive to coding assistants. Any organization building AI-powered workflows — from enterprise AI transformation to customer service automation — can benefit from the architectural principles the Claude Code leak made visible.
Strategic Outlook: What This Means for AI Development
The Claude Code leak is a watershed moment for the AI development industry — not because of the leak itself, but because of what the code reveals about where AI coding assistants are headed. The feature flags paint a clear picture of the near-term future: persistent AI agents, multi-agent collaboration, voice-driven development, and cloud-offloaded deep reasoning.
The Shift from Reactive to Proactive AI
KAIROS and autoDream signal the most important architectural shift. Today's AI coding tools are reactive — you ask, they respond. Tomorrow's will be proactive — monitoring your project, building understanding over time, and acting autonomously when the moment is right. This shift will fundamentally change how development teams operate, moving from "human directs AI" to "human and AI collaborate continuously."
Competitive Implications
The leak also levels the playing field in unexpected ways. Competitors like Cursor, Windsurf, and GitHub Copilot now have a detailed blueprint of Claude Code's architecture. Open source projects like Claurst (a Rust reimplementation) are already using the leaked design as a reference. The next generation of AI coding tools will be built on patterns that Anthropic pioneered but can no longer exclusively own.
For development teams evaluating AI tools, the lesson is clear: the quality of the agentic harness matters as much as the quality of the underlying model. When comparing AI coding assistants, look beyond benchmark scores to the orchestration architecture — how the tool manages context, handles errors, enforces permissions, and enables multi-step reasoning.