
In mid-September 2025, Chinese state-sponsored attackers used artificial intelligence (AI) technology developed by Anthropic to orchestrate automated cyberattacks as part of what the company described as a “highly sophisticated espionage campaign.”
“The attackers exploited AI’s ‘agentic’ capabilities to an unprecedented degree, using AI not only as an advisor but also to carry out the cyberattacks themselves,” the AI startup said.
The activity has been assessed as an attempt to manipulate Anthropic’s AI coding tool, Claude Code, into compromising approximately 30 global targets spanning major technology companies, financial institutions, chemical manufacturers, and government agencies; a subset of these intrusions succeeded. Anthropic has since banned the relevant accounts and strengthened its defense mechanisms to flag such attacks.
The campaign, tracked as GTG-1002, marks the first time a threat actor has leveraged AI to carry out a “massive cyber attack” against high-value targets and gather intelligence without extensive human intervention, demonstrating the continued evolution of adversarial uses of the technology.

Anthropic described the operation as well-resourced and professionally coordinated, saying the attackers turned Claude into an “autonomous cyberattack agent” that supported various stages of the attack lifecycle, including reconnaissance, vulnerability discovery, exploitation, lateral movement, credential collection, data analysis, and exfiltration.
Specifically, the operation combined Claude Code with Model Context Protocol (MCP) tools, with the former acting as a central nervous system that processed human operators’ instructions and broke multi-stage attacks down into smaller technical tasks that could be offloaded to sub-agents.
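To make that division of labor concrete, here is a minimal sketch of the orchestrator/sub-agent pattern described above: a coordinator breaks one broad objective into narrow, self-contained tasks and dispatches each to a worker, so no single worker ever sees the full objective. All names here (Task, plan_tasks, run_subagent) are hypothetical scaffolding for illustration, not Anthropic’s API or the attackers’ actual framework.

```python
from dataclasses import dataclass

@dataclass
class Task:
    description: str  # one narrow technical request, stripped of wider context
    result: str = ""  # filled in by whichever sub-agent executes it

def plan_tasks(objective: str) -> list[Task]:
    # Stand-in for the orchestrator model: split the objective into steps
    # that each read like a routine, standalone request.
    return [Task(f"step {i} of: {objective}") for i in range(1, 4)]

def run_subagent(task: Task) -> Task:
    # Stand-in for a worker agent invoked through an MCP-style tool call.
    task.result = f"completed: {task.description}"
    return task

def orchestrate(objective: str) -> list[Task]:
    # The coordinator fans tasks out to sub-agents and aggregates the
    # results; only this layer ever holds the whole picture.
    return [run_subagent(t) for t in plan_tasks(objective)]

if __name__ == "__main__":
    for t in orchestrate("audit the example.test lab network"):
        print(t.result)
```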
“Human operators instructed instances of Claude Code to act in groups as autonomous penetration testing orchestrators and agents, leveraging AI to independently perform 80-90% of tactical operations at request rates that would be physically impossible for humans,” the company added. “Human responsibility focused on campaign initialization and approval decisions at key escalation points.”
Human involvement also occurred at strategic junctures, such as approving the progression from reconnaissance to active exploitation, authorizing the use of harvested credentials for lateral movement, and making final decisions about the scope and retention of exfiltrated data.

The system was part of an attack framework that took a target of interest as input from a human operator and leveraged MCP capabilities to perform reconnaissance and map the attack surface. In the next stage of the attack, the Claude-based framework facilitated vulnerability discovery and validated the discovered flaws by generating customized exploit payloads.
Upon receiving approval from the human operator, the system deployed exploits to gain a foothold and initiated a series of post-exploitation activities, including credential harvesting, lateral movement, data collection, and exfiltration.
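Read as a workflow, the framework amounts to a phased pipeline with human approval gates at the escalation points the report names. The sketch below models that control flow only; the phase names follow the article, while the gating logic and all identifiers are illustrative assumptions rather than the actual system.

```python
from enum import Enum, auto

class Phase(Enum):
    RECONNAISSANCE = auto()
    VULNERABILITY_DISCOVERY = auto()
    EXPLOITATION = auto()
    LATERAL_MOVEMENT = auto()
    EXFILTRATION = auto()

# Escalation points where, per the report, a human decision was required.
APPROVAL_GATES = {Phase.EXPLOITATION, Phase.LATERAL_MOVEMENT, Phase.EXFILTRATION}

def operator_approves(phase: Phase) -> bool:
    # Stand-in for the human operator's decision; this sketch always declines.
    print(f"approval requested for {phase.name}")
    return False

def advance(phases: list[Phase]) -> None:
    # Walk the pipeline in order, pausing at each gated phase for sign-off.
    for phase in phases:
        if phase in APPROVAL_GATES and not operator_approves(phase):
            print(f"halted before {phase.name}: no operator approval")
            return
        print(f"autonomous phase complete: {phase.name}")

if __name__ == "__main__":
    advance(list(Phase))
```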
In one case targeting an unnamed technology company, the attackers allegedly instructed Claude to independently query its databases and systems, parse the results, flag sensitive information, and group the findings by intelligence value. Anthropic also said its AI tool generated detailed attack documentation at every stage, potentially allowing the attackers to hand off persistent access to additional teams for long-term operations after the initial wave.
According to the report, “By presenting these tasks to Claude as routine technical requests through carefully crafted prompts and established personas, the attacker was able to force Claude to execute individual components of the attack chain without access to the broader malicious context.”
There is no evidence that the operation involved the development of custom malware; instead, the attackers were found to rely extensively on publicly available network scanners, database exploitation frameworks, password crackers, and binary analysis suites.

However, an examination of the activity also revealed significant limitations of the AI tooling, which proved a major impediment to the campaign’s overall effectiveness: the models tended to hallucinate and fabricate data during autonomous operations, inventing credentials that did not work or presenting publicly available information as important findings.
This disclosure comes nearly four months after Anthropic thwarted another sophisticated operation in July 2025 that weaponized Claude to carry out large-scale theft and extortion of personal data. Over the past two months, OpenAI and Google have also disclosed attacks launched by threat actors leveraging ChatGPT and Gemini, respectively.
“This campaign demonstrates that the barriers to conducting sophisticated cyberattacks have been significantly lowered,” the company said.
“With the right setup, threat actors can now use agentic AI systems to do the work of entire teams of experienced hackers: analyzing target systems, writing exploit code, and scanning vast datasets of stolen information more efficiently than any human operator. Even inexperienced and low-resourced groups can potentially carry out large-scale attacks of this kind.”
