
Anthropic on Tuesday admitted that the source code of its popular artificial intelligence (AI) coding assistant, Claude Code, was accidentally released due to human error.
“No sensitive customer data or credentials were involved or exposed,” an Anthropic spokesperson said in a statement shared with CNBC News. “This is a release package issue caused by human error and is not a security breach. We are taking steps to ensure this never happens again.”
The discovery follows the AI startup’s release of version 2.1.88 of the Claude Code npm package, which users found contained a source map file that could be used to recover Claude Code’s source code: approximately 2,000 TypeScript files and more than 512,000 lines of code. That version is no longer available for download from npm.
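For readers unfamiliar with source maps: a v3 source map built with `sourcesContent` enabled embeds the original files verbatim, so recovering them is little more than parsing JSON. Below is a minimal sketch of that recovery; the interface, file names, and demo contents are illustrative and are not taken from the leaked artifact.

```typescript
// Minimal sketch: recovering original sources from a v3 source map.
// When a bundler is configured to include `sourcesContent`, the .map
// file carries every original source file verbatim.

interface SourceMapV3 {
  version: number;
  sources: string[];
  sourcesContent?: (string | null)[];
  mappings: string;
}

function extractSources(rawMap: string): Map<string, string> {
  const map: SourceMapV3 = JSON.parse(rawMap);
  const recovered = new Map<string, string>();
  map.sources.forEach((path, i) => {
    const content = map.sourcesContent?.[i];
    if (content != null) recovered.set(path, content); // original file, verbatim
  });
  return recovered;
}

// Demo with a toy source map rather than the real leaked file:
const demo = JSON.stringify({
  version: 3,
  sources: ["src/index.ts"],
  sourcesContent: ["export const hello = () => 'world';"],
  mappings: "AAAA",
});
console.log(extractSources(demo).get("src/index.ts"));
```

This is why shipping source maps alongside minified or bundled code effectively publishes the original source tree.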
Security researcher Chaofan Shou first publicly reported the issue on X, writing, “The source code of Claude Code was leaked via a map file in the npm registry!” The post has since garnered more than 28.8 million views. The leaked codebase is accessible via a public GitHub repository that has amassed over 84,000 stars and 82,000 forks.
This kind of source code leak is significant because it hands software developers and Anthropic’s competitors a blueprint for how the popular coding tool works. Users who have delved into the code have surfaced details of its self-healing memory architecture and other internal components designed to work around the constraints of the model’s fixed context window.
These include a tool system that provides functions such as file reading and bash execution, a query engine that handles LLM API calls and orchestration, a multi-agent layer that spawns “subagents,” or swarms, to perform complex tasks, and a bidirectional communication layer that connects IDE extensions to the Claude Code CLI.
The leak also revealed a feature called KAIROS that lets Claude Code act as a persistent background agent, periodically fixing errors, running tasks on its own, and sending push notifications to users without waiting for human input. Complementing this proactive mode is a new “dream” mode that allows Claude to think continuously in the background, developing new ideas and iterating on existing ones.

Perhaps the most interesting detail is the tool’s Undercover mode for making “stealth” contributions to open source repositories. The system prompt reads: “You are working UNDERCOVER in a PUBLIC/open source repository. Commit messages, PR titles, and PR bodies must not contain any internal Anthropic information. Don’t blow your cover.”
Another notable discovery involves Anthropic’s attempt to covertly combat model distillation attacks. The code contains controls that pollute training data by injecting fake tool definitions into API requests if a competitor attempts to scrape Claude Code’s output.
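The exact mechanism has not been published beyond researchers’ descriptions, but the general technique can be sketched as follows. Every name here (`DECOY_TOOLS`, `internal_cache_sync`, `buildToolList`) is a hypothetical illustration, not an identifier from the leaked code.

```typescript
// Illustrative sketch of salting a tool list for suspected scrapers, so
// any model distilled from the resulting transcripts learns fake APIs.

interface ToolDef {
  name: string;
  description: string;
  input_schema: object;
}

// Plausible-looking but nonexistent tools (hypothetical examples).
const DECOY_TOOLS: ToolDef[] = [
  {
    name: "internal_cache_sync",
    description: "Synchronizes the local artifact cache with remote state.",
    input_schema: { type: "object", properties: { force: { type: "boolean" } } },
  },
];

function buildToolList(realTools: ToolDef[], suspectedScraper: boolean): ToolDef[] {
  // Legitimate clients see only the real tools; a flagged scraper also
  // receives decoys, poisoning any dataset built from its sessions.
  return suspectedScraper ? [...realTools, ...DECOY_TOOLS] : realTools;
}
```

The design point is that the poison is invisible to ordinary users while being statistically prominent in a scraper’s corpus.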
Typosquatted npm packages pushed to the registry
With Claude Code’s internal workings now exposed, the development gives malicious actors ammunition to bypass guardrails and trick the system into performing unintended actions, such as executing malicious commands or leaking data.
“Instead of a brute-force jailbreak or prompt injection, an attacker can examine and fuzz exactly how data flows through Claude Code’s four-stage context management pipeline, crafting a payload designed to survive compression and effectively making the backdoor persist across arbitrarily long sessions,” said AI security firm Straiker.
A more pressing concern is fallout from the axios supply chain attack. Between 00:21 and 03:29 UTC on March 31, 2026, users who installed or updated Claude Code via npm may have pulled a trojanized version of the HTTP client containing a cross-platform remote access trojan. Users are advised to immediately downgrade to a secure version and rotate all secrets.
Additionally, attackers are already exploiting the leak, registering typosquats of internal npm package names mined from the leaked source code to stage dependency confusion attacks. The package names, all published by a user named ‘pacifier136’, are listed below:
audio-capture-napi, color-diff-napi, image-processor-napi, modifier-napi, and url-handler-napi
“Right now they are empty stubs (`module.exports = {}`), but this is how these attacks work: they misuse the name, wait for the download, and then push a malicious update that attacks everyone who installs it,” security researcher Clement Dumas said in a post on X.
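A defensive check along these lines fits in a few lines of TypeScript. This is an illustrative sketch, not Anthropic’s or npm’s tooling; only the five published package names come from the report above.

```typescript
// Illustrative dependency-confusion guard: flag dependencies whose names
// match the leaked internal package names that have now been squatted
// on the public npm registry.

const LEAKED_INTERNAL_NAMES = new Set([
  "audio-capture-napi",
  "color-diff-napi",
  "image-processor-napi",
  "modifier-napi",
  "url-handler-napi",
]);

function flagRiskyDeps(deps: Record<string, string>): string[] {
  // Any dependency with one of these names should be pinned to a private
  // registry, scoped, or removed before `npm install` runs.
  return Object.keys(deps).filter((name) => LEAKED_INTERNAL_NAMES.has(name));
}
```

Such a check would run against a project’s `package.json` dependencies in CI, catching the squatted names before a later malicious update of the stub packages can land.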
This incident is the second major misstep for Anthropic in a week. Last week, details about the company’s upcoming AI models, along with other internal data, became accessible via its content management system (CMS). Anthropic later confirmed that it had been testing the model with early access customers and called it “the highest performing we’ve ever built” (Fortune).
