
Anthropic announced on Monday that it had identified an “industrial-scale campaign” in which three artificial intelligence (AI) companies, DeepSeek, Moonshot AI, and MiniMax, illegally extracted Claude’s capabilities to improve their own models.
The distillation attacks involved more than 16 million interactions with its large language model (LLM) via approximately 24,000 fraudulent accounts that violated terms of service and regional access restrictions. All three companies are based in China, where Anthropic does not offer its services due to “legal, regulatory, and security risks.”
Distillation refers to the technique of training a less capable model on the outputs produced by a more powerful AI system. Distillation is a legal way for companies to produce smaller, cheaper versions of their own frontier models, but it is illegal for competitors to use it to acquire those capabilities from another AI company’s model in a fraction of the time and cost it would take to develop them in-house.
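In its simplest, legitimate form, the technique can be illustrated in a few lines of PyTorch: a small “student” network is trained to match the softened output distribution of a frozen “teacher.” The sketch below is illustrative only; the toy model sizes, temperature, and random data are assumptions and do not reflect any lab’s actual setup.

```python
# Minimal knowledge-distillation sketch: the student learns to imitate
# the teacher's output distribution. Shapes and data are toy assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
teacher.eval()  # the teacher is frozen; only its outputs are used

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's distribution

for step in range(100):
    x = torch.randn(64, 32)  # stand-in for real training inputs
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)
    # KL divergence between the softened teacher and student distributions
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The same idea scales up to LLMs: instead of logits from a local network, the teacher’s responses are harvested through an API, which is why the volume of API interactions is the telltale signal in the campaigns described here.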
“Illegally distilled models lack the necessary safeguards and pose a significant national security risk,” Anthropic said. “Models built through illegal distillation are unlikely to retain these safeguards, meaning that dangerous capabilities can proliferate with many protective features completely stripped away.”
Foreign AI companies distilling American models could weaponize these unprotected capabilities to facilitate cyberattacks and other malicious activity, using them as the basis for military, intelligence, and surveillance systems that authoritarian governments could deploy for offensive cyber operations, disinformation campaigns, and mass surveillance.
The campaigns detailed by the AI startup involved the use of fraudulent accounts and commercial proxy services to access Claude at scale while avoiding detection. Anthropic said it was able to attribute each campaign to a specific AI lab based on request metadata, IP address correlation, and infrastructure metrics.
The details of the three distillation attacks are as follows:
- DeepSeek targeted Claude’s reasoning capabilities with rubric-based scoring tasks across more than 150,000 exchanges, including prompts seeking censorship-safe alternatives to politically sensitive questions about dissidents, party leaders, and authoritarianism.
- Moonshot AI targeted Claude’s agentic reasoning and tool use, coding capabilities, computer-use agent development, and computer vision across more than 3.4 million exchanges.
- MiniMax targeted Claude’s agentic coding and tool-use capabilities across more than 13 million exchanges.
“The volume, structure, and focus of the prompts differ from normal usage patterns and reflect deliberate capability extraction rather than legitimate use,” Anthropic added. “Each campaign targeted Claude’s most differentiated capabilities: agentic reasoning, tool use, and coding.”
The company also noted that the attacks relied on commercial proxy services that resell access to Claude and other frontier AI models at scale. These services use a “hydra cluster” architecture, a large network of fraudulent accounts across which API traffic is distributed.
This access is then used to issue large numbers of carefully crafted prompts designed to extract specific capabilities from the model, with the resulting high-quality responses collected to train the operators’ own models.
“The breadth of these networks means there is no single point of failure,” Anthropic said. “When one account is banned, a new one is used in its place. In one case, a single proxy network managed over 20,000 fraudulent accounts simultaneously, mixing distillation traffic with unrelated customer requests and making detection difficult.”
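Schematically, the resilience Anthropic describes comes from spreading traffic across a replaceable pool of credentials. The following is a minimal illustration of that pattern; the names here (AccountPool and its methods) are hypothetical and do not correspond to any real tooling.

```python
# Schematic of the "hydra cluster" pattern: no single point of failure,
# because a banned account is simply swapped for a replacement.
import random

class AccountPool:
    def __init__(self, api_keys):
        self.active = list(api_keys)

    def next_key(self):
        # Spreading traffic across many accounts defeats per-account
        # rate limits and volume-based detection on any single account.
        return random.choice(self.active)

    def replace(self, banned_key, replacement_key):
        # The property Anthropic highlights: banning one account
        # does not interrupt the campaign.
        self.active.remove(banned_key)
        self.active.append(replacement_key)

pool = AccountPool([f"key-{i}" for i in range(20_000)])
key = pool.next_key()          # each request can use a different account
pool.replace(key, "key-new")   # a ban just triggers a swap
```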
To combat this threat, Anthropic said it has built several classifiers and behavioral fingerprinting systems to identify suspicious distillation patterns in API traffic, strengthened verification for educational accounts, security research programs, and startup organizations, and implemented enhanced safeguards to make model outputs less useful for unauthorized distillation.
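Anthropic has not published the details of these classifiers, but a behavioral fingerprint of this kind can be sketched as a simple heuristic over per-account traffic statistics. The features and thresholds below are purely illustrative assumptions, chosen to reflect the signals the company describes: extraction campaigns issue huge volumes of highly templated prompts, unlike organic usage.

```python
# Illustrative sketch of behavioral fingerprinting for distillation traffic.
# Features and thresholds are assumptions, not Anthropic's actual system.
from dataclasses import dataclass

@dataclass
class AccountStats:
    requests_per_day: float
    distinct_prompt_templates: int  # near-identical prompts collapse to one template
    total_prompts: int

def looks_like_distillation(stats: AccountStats) -> bool:
    if stats.total_prompts == 0:
        return False
    # Extraction traffic: very high volume, very low prompt diversity.
    template_ratio = stats.distinct_prompt_templates / stats.total_prompts
    return stats.requests_per_day > 5_000 and template_ratio < 0.01

print(looks_like_distillation(AccountStats(20_000, 40, 20_000)))  # True
print(looks_like_distillation(AccountStats(50, 45, 50)))          # False
```

In practice such signals would be combined with the IP correlation and infrastructure metrics mentioned above, since proxy networks deliberately blend distillation traffic with legitimate requests to evade any single heuristic.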
This disclosure comes weeks after the Google Threat Intelligence Group (GTIG) revealed that it had identified and disrupted distillation and model extraction attacks targeting Gemini’s inference capabilities through over 100,000 prompts.
Google said earlier this month that “model extraction and distillation attacks do not threaten the confidentiality, availability, or integrity of our AI services, so they typically do not pose a risk to the average user.” “Rather, the risk is concentrated on model developers and service providers,” the company added.
