Anthropic Distillation Attacks: DeepSeek, Moonshot, MiniMax
Anthropic accuses DeepSeek, Moonshot AI, and MiniMax of industrial-scale distillation via 24,000 fake accounts and 16M+ Claude exchanges. Full analysis inside.
- 24,000 fraudulent accounts created
- 16M+ Claude exchanges recorded
- 20,000+ accounts per proxy cluster
- 3 Chinese AI labs implicated
Key Takeaways
On February 23, 2026, Anthropic published what may be the most detailed account of industrial-scale model theft in AI history. The company accused three Chinese AI labs — DeepSeek, Moonshot AI, and MiniMax — of conducting systematic distillation attacks against Claude using 24,000 fraudulent accounts and over 16 million exchanges. The operation extracted reasoning capabilities, tool-use patterns, and chain-of-thought processes that took Anthropic years and billions of dollars to develop.
The allegations land at a moment of maximum geopolitical tension over AI capabilities. The Trump administration is debating H200 chip exports to China, Anthropic CEO Dario Amodei testified before the House Homeland Security Committee on AI risks, and DeepSeek V4 just demonstrated capabilities that rival frontier Western models at a fraction of the cost. Whether you see this as legitimate intellectual property protection or competitive gatekeeping depends on where you stand — and both positions have merit.
What Anthropic Claims Happened
Anthropic's report details a sustained extraction campaign spanning months. The three accused companies allegedly created thousands of accounts through automated registration systems, then used those accounts to systematically query Claude with prompts designed to elicit its internal reasoning, tool-use capabilities, and safety-boundary behaviors.
Per-Company Breakdown
- ~150,000 exchanges recorded. Targeted advanced reasoning and chain-of-thought capabilities; the smallest volume, but the most precisely targeted queries.
- ~3.4 million exchanges recorded. A broader extraction campaign spanning tool use, coding, and multi-turn conversation patterns.
- ~13 million exchanges recorded. The largest volume by far, with wide-spectrum extraction covering general reasoning, safety boundaries, and conversational patterns.
The Hydra Cluster Architecture
The report describes what Anthropic calls “hydra clusters” — proxy architectures that managed 20,000+ accounts simultaneously. Each cluster rotated through accounts to distribute API traffic, mixed distillation queries with legitimate-seeming requests, and used geographic IP distribution to avoid triggering rate-limiting or abuse detection. The name reflects the multi-headed nature of the operation: cut off one account, and thousands more continue the extraction.
What Is Model Distillation and Why It Matters
Model distillation is a well-established technique in machine learning where a smaller “student” model learns to replicate the behavior of a larger “teacher” model. In its legitimate form, it is how companies deploy efficient models to edge devices, reduce inference costs, and create specialized models for narrow tasks. Google, Meta, and Anthropic itself use distillation internally.
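Mechanically, classic distillation trains the student to match the teacher's softened output distribution rather than hard labels. A minimal sketch of the standard temperature-scaled KL objective (names and numbers here are illustrative, not drawn from Anthropic's report):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Softened probability distribution over output classes."""
    z = logits / temperature
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    A higher temperature exposes the teacher's relative probabilities
    on non-top answers -- the signal the student learns to imitate.
    """
    p = softmax(teacher_logits, temperature)  # teacher target
    q = softmax(student_logits, temperature)  # student prediction
    return float(np.sum(p * np.log(p / q)))

# A student that matches the teacher exactly incurs ~zero loss.
logits = np.array([2.0, 1.0, 0.1])
assert distillation_loss(logits, logits) < 1e-9

# A mismatched student incurs positive loss, driving it toward the teacher.
assert distillation_loss(logits, np.array([0.1, 1.0, 2.0])) > 0.1
```

The same principle is what makes API outputs valuable at scale: every response is a sample from the teacher's distribution, and enough samples let a student approximate it.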
Legitimate Use vs. Competitive Extraction
The line between legitimate distillation and what Anthropic describes comes down to consent and scale. A developer using Claude's API to build an application generates model outputs as a byproduct of legitimate use. A company creating 24,000 fake accounts to systematically generate training data is doing something categorically different: the intent is not to use the model but to replicate it. The report describes four extraction techniques:
- Prompt variation: Thousands of paraphrased versions of the same question to capture the model's reasoning distribution, not just a single answer
- Chain-of-thought elicitation: Prompts specifically crafted to make the model “show its work,” revealing internal reasoning steps
- Safety boundary probing: Queries designed to map exactly where the model refuses requests, enabling the student model to learn the boundaries without the corresponding safety training
- Tool-use extraction: Complex multi-step tasks that force the model to demonstrate agentic capabilities — function calling, planning, and error recovery
The Safety Guardrail Problem
This is where distillation becomes a security concern rather than just an intellectual property issue. When you distill a model through its outputs, you capture its capabilities but not its safety training. The RLHF (Reinforcement Learning from Human Feedback), Constitutional AI constraints, and red-team-tested refusal mechanisms that Anthropic spent years developing do not transfer through API outputs. The resulting student model can replicate Claude's reasoning power without Claude's safety filters — creating what researchers call an “unconstrained” model.
How Anthropic Detected the Attacks
Anthropic's report provides unusual technical detail about its detection methodology. Ironically, the scale of the operation — which gave the accused companies more training data — also made detection more feasible. Patterns that would be invisible across dozens of accounts become statistically significant across thousands.
IP Address Correlation
Despite geographic distribution, clusters of accounts shared infrastructure patterns — similar IP ranges, identical timing distributions, and coordinated registration timestamps that revealed centralized orchestration.
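A toy version of that correlation logic, assuming per-account metadata such as source IP and registration timestamp (field names and thresholds are hypothetical, not from the report):

```python
from collections import defaultdict
from datetime import datetime, timedelta

def cluster_by_infrastructure(accounts, window=timedelta(minutes=10)):
    """Group accounts that share an IP /24 prefix AND registered in a
    tight time window -- a crude signal of centralized orchestration."""
    by_prefix = defaultdict(list)
    for acct in accounts:
        prefix = ".".join(acct["ip"].split(".")[:3])  # /24 prefix
        by_prefix[prefix].append(acct)

    clusters = []
    for prefix, group in by_prefix.items():
        group.sort(key=lambda a: a["registered"])
        run = [group[0]]
        for a in group[1:]:
            if a["registered"] - run[-1]["registered"] <= window:
                run.append(a)  # same burst of registrations
            else:
                if len(run) > 1:
                    clusters.append((prefix, run))
                run = [a]
        if len(run) > 1:
            clusters.append((prefix, run))
    return clusters

t0 = datetime(2026, 1, 1, 12, 0)
accounts = [
    {"ip": "203.0.113.5",  "registered": t0},
    {"ip": "203.0.113.9",  "registered": t0 + timedelta(minutes=2)},
    {"ip": "203.0.113.17", "registered": t0 + timedelta(minutes=4)},
    {"ip": "198.51.100.1", "registered": t0 + timedelta(hours=6)},  # unrelated user
]
flagged = cluster_by_infrastructure(accounts)
# One flagged cluster: three accounts registered within minutes in 203.0.113.0/24
```

Real detection pipelines operate on far richer signals, but the core idea is the same: individually plausible accounts become suspicious in aggregate.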
Behavioral Fingerprinting
Normal users show diverse query patterns reflecting different tasks and skill levels. The flagged accounts shared distinctive prompt structures — systematic prompt variation, consistent formatting, and query sequences that followed extraction methodologies rather than genuine usage.
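One way to operationalize behavioral fingerprinting is to reduce each account's prompts to a structural feature vector and compare accounts by cosine similarity; near-identical fingerprints across supposedly unrelated accounts suggest a shared playbook. A minimal sketch with deliberately crude, hypothetical features:

```python
import math
from collections import Counter

def fingerprint(prompts):
    """Crude structural fingerprint: normalized counts of a few prompt
    features. Real systems would use far richer signals (templates,
    timing, syntax trees)."""
    counts = Counter()
    for p in prompts:
        counts["len_bucket_%d" % (len(p) // 50)] += 1
        counts["starts_imperative"] += p.split()[0].lower() in {"explain", "list", "walk"}
        counts["mentions_steps"] += "step" in p.lower()
    total = sum(counts.values()) or 1
    return {k: v / total for k, v in counts.items()}

def cosine(a, b):
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

acct_a = ["Explain step by step how X works.", "Explain step by step how Y works."]
acct_b = ["Explain step by step how Z works.", "Explain step by step how W works."]
acct_c = ["hi can u help me fix this bug", "what's a good pizza dough recipe?"]

# Template-driven accounts look alike; an organic account does not.
assert cosine(fingerprint(acct_a), fingerprint(acct_b)) > 0.95
assert cosine(fingerprint(acct_a), fingerprint(acct_c)) < 0.9
```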
Chain-of-Thought Detection
A disproportionate number of queries from flagged accounts included prompts designed to elicit step-by-step reasoning — “think through this carefully,” “explain your reasoning,” “walk me through your approach” — at rates far exceeding normal API usage.
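The detection primitive here is simple: measure each account's rate of reasoning-elicitation phrases against an organic baseline. A sketch, with marker phrases taken from the report's examples but baseline and threshold values purely illustrative:

```python
COT_MARKERS = (
    "think through this carefully",
    "explain your reasoning",
    "walk me through your approach",
)

def cot_rate(queries):
    """Fraction of an account's queries containing a reasoning-elicitation phrase."""
    if not queries:
        return 0.0
    hits = sum(any(m in q.lower() for m in COT_MARKERS) for q in queries)
    return hits / len(queries)

def flag_accounts(accounts, baseline=0.02, multiplier=10):
    """Flag accounts whose elicitation rate exceeds many multiples of the
    organic baseline (numbers are illustrative, not Anthropic's)."""
    return [acct for acct, queries in accounts.items()
            if cot_rate(queries) > baseline * multiplier]

accounts = {
    "organic":   ["fix this SQL query", "summarize this email"],
    "extractor": ["Explain your reasoning: why does X hold?",
                  "Walk me through your approach to Y.",
                  "Think through this carefully: is Z true?"],
}
assert flag_accounts(accounts) == ["extractor"]
```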
Statistical Pattern Recognition
The query distribution across flagged accounts was statistically inconsistent with organic usage. Normal traffic follows power-law distributions with heavy tails. The distillation traffic showed uniform coverage patterns — systematically exploring capability space rather than solving specific problems.
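One way to quantify "uniform coverage versus heavy-tailed organic usage" is the normalized Shannon entropy of an account's topic distribution: organic usage, dominated by a few tasks, scores low, while a systematic sweep of the capability space scores near 1.0. A sketch with made-up counts and an illustrative threshold:

```python
import math

def normalized_entropy(topic_counts):
    """Shannon entropy of the topic distribution, normalized to [0, 1]."""
    total = sum(topic_counts)
    probs = [c / total for c in topic_counts if c > 0]
    if len(probs) <= 1:
        return 0.0
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(len(probs))  # 1.0 == perfectly uniform

# Organic account: a few dominant tasks with a long tail (power-law-ish).
organic = [120, 40, 15, 6, 3, 1, 1]
# Extraction account: near-uniform sweep across capability areas.
sweep = [24, 25, 23, 26, 24, 25, 24]

assert normalized_entropy(sweep) > 0.99
assert normalized_entropy(organic) < 0.8
```

In practice a detector would combine a statistic like this with the infrastructure and fingerprinting signals above rather than rely on any single test.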
The Geopolitical Context
Anthropic's report did not arrive in a vacuum. The timing intersects with several major developments in the U.S.–China AI competition, and understanding that context is essential to evaluating the allegations fairly.
DeepSeek's latest model demonstrated reasoning capabilities approaching frontier Western models at dramatically lower cost. The distillation allegations raise questions about how much of that capability was independently developed versus extracted from competitors. Read our DeepSeek V4 guide for the full technical breakdown.
The Trump administration is actively debating whether to allow H200 GPU exports to China. Anthropic CEO Dario Amodei has publicly called for tighter export controls, arguing that API-level distillation makes hardware restrictions insufficient if software capabilities can be extracted directly.
Amodei testified before the House Homeland Security Committee on AI safety risks just weeks before the report. His testimony emphasized the danger of AI capability transfer to adversarial nations — a narrative that the distillation report directly supports.
OpenAI has made parallel accusations about Chinese labs distilling from GPT-4 and its successors. Anthropic's report provides more technical detail, but the pattern of Western frontier labs accusing Chinese competitors of model extraction is now well-established across the industry.
Critics point out that the timing serves Anthropic's policy advocacy. Supporters argue that the technical evidence stands regardless of timing. Both observations can be true simultaneously — the evidence should be evaluated on its own merits, while acknowledging the strategic context.
The Community Backlash
The AI community's response has been sharply divided, and both sides make substantive arguments worth considering. Critics of Anthropic's framing point to the industry's own record on training data:
- Frontier labs trained on vast quantities of copyrighted web data, books, and code without consent or compensation
- The New York Times, Getty Images, and thousands of authors have active lawsuits against OpenAI, Google, and Meta for training data usage
- Anthropic itself faces a lawsuit from music publishers over copyrighted lyrics in training data
- “You cannot build your model on everyone else's work, then cry foul when someone builds on yours” — a common refrain in developer communities

Anthropic's defenders counter that targeted distillation differs in kind, not just degree:
- API distillation specifically replicates tool-use and agentic capabilities — not just text generation
- Distilled models lack safety training, creating systems that can bypass refusals the source model was designed to enforce
- Training on web data (even copyrighted) is qualitatively different from targeted extraction of a specific system's capabilities
- The safety implications of unconstrained distilled models represent a genuine risk regardless of the intellectual property debate
The most honest assessment acknowledges both positions. The frontier labs do have a credibility problem on intellectual property — they built their businesses on contested training data practices. But the safety argument about distillation is substantively different from the copyright debate. Extracting a model's safety boundaries to create an unconstrained clone raises risks that web-scraping for training data does not. These are two separate conversations that keep getting conflated.
National Security and Safety Implications
Beyond the intellectual property and competitive dimensions, the distillation allegations raise concrete safety questions that warrant serious attention regardless of one's position on the hypocrisy debate.
The Unconstrained Model Problem
When you distill a model through its outputs, the student model learns what the teacher can do — but not what it was trained not to do. Anthropic's Constitutional AI framework, its RLHF-trained refusal mechanisms, and its red-team-tested safety boundaries do not transfer through API outputs. A distilled model can potentially generate content that Claude would refuse: detailed instructions for harmful activities, persuasive disinformation at scale, or offensive cyber tools without the safety guardrails.
Frontier models have demonstrated capability in vulnerability discovery, exploit generation, and offensive security research. These capabilities are deliberately constrained through safety training. Distilled versions without those constraints could lower the barrier for sophisticated cyber operations.
Claude's reasoning and persuasion capabilities — designed for helpful dialogue — could be repurposed in an unconstrained model for generating targeted disinformation, social engineering campaigns, or influence operations at a scale that manual creation cannot match.
Who Was Not Accused
Notably, Anthropic's report does not implicate all Chinese AI companies. Alibaba's Qwen team and Z.ai are explicitly not named. This selectivity lends some credibility to the allegations — a purely political report might cast a wider net. It also suggests that the three accused companies' behavior was distinguishable from normal API usage patterns of other Chinese AI organizations.
As of publication, none of the three accused companies had issued public responses to the allegations. For a deeper look at how Claude's capabilities compare across models, see our Claude Sonnet 4.6 benchmarks and pricing guide.
What This Means for AI Development
Regardless of how the specific allegations play out, the Anthropic distillation report signals several durable shifts in the AI industry that businesses and developers should plan for.
API Protections Will Tighten
Expect stricter rate-limiting, more sophisticated abuse detection, and potentially identity verification requirements for API access across all frontier model providers. This will increase friction for legitimate developers and may drive up API costs.
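"Stricter rate-limiting" in practice often means per-account token buckets, which permit short bursts but cap sustained request rates. A minimal sketch, with parameters and the caller-supplied clock purely illustrative:

```python
class TokenBucket:
    """Per-account token bucket: allows brief bursts but caps the
    sustained request rate. Timestamps are supplied by the caller
    (e.g. from time.monotonic()) to keep the sketch deterministic."""

    def __init__(self, rate_per_sec, burst, now=0.0):
        self.rate = rate_per_sec       # tokens refilled per second
        self.capacity = burst          # maximum burst size
        self.tokens = float(burst)
        self.last = now

    def allow(self, now):
        # Refill tokens for the elapsed interval, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rate_per_sec=1, burst=3)
results = [bucket.allow(now=0.0) for _ in range(5)]
# A burst of 3 is allowed, then requests are rejected...
assert results == [True, True, True, False, False]
# ...until tokens refill with time.
assert bucket.allow(now=2.0)
```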
Open-Source Dynamics Shift
The distillation debate strengthens the argument for open-weight models (like Meta's Llama) while simultaneously raising questions about what those open models were trained on. The distinction between “open-weight” and “open-source” will matter more as these debates intensify.
Safety Becomes a Competitive Differentiator
Companies that can demonstrate their models were not built on extracted capabilities — and that include robust safety training — will have a competitive advantage in regulated industries and government contracts.
Business Planning Must Account for Supply Chain Risk
Organizations building on AI models need to understand the provenance of those models. If a model's capabilities were derived from unauthorized distillation, downstream users could face legal, reputational, or operational risks as enforcement evolves.