Anthropic’s announcement of the Claude Mythos Preview model alongside Project Glasswing has sparked widespread scrutiny as experts warn that the capabilities of artificial intelligence (AI) systems could accelerate the discovery and exploitation of software vulnerabilities.
Anthropic has locked Mythos inside Project Glasswing, restricting access to a small group of large technology companies focused on cybersecurity, in an attempt to contain and direct the model. Anthropic’s decision not to make Mythos publicly available quickly led to claims that the model was “too powerful” for widespread use.
“Anthropic’s Mythos Preview is a warning to the entire industry, and the fact that Anthropic itself has chosen not to publicly release it says everything about the feature threshold we are currently crossing,” Camellia Chan, CEO and co-founder of hardware-based cybersecurity company X-PHY, told Live Science.
But what can Mythos actually do, and can it be controlled?
What is Mythos and what can it do?
According to Anthropic’s own description, Mythos is its most capable model to date, with unusually strong performance in coding and long-context inference. In testing, that capability translated into concrete results: the model identified thousands of critical vulnerabilities across major operating systems and browsers, including flaws that had gone unnoticed for decades.
Mythos sits on top of Anthropic’s Claude model, but calling it an “update” understates its capabilities. Based on information shared by Anthropic representatives and details revealed through leaks, the system is built to handle large and messy codebases without losing threads along the way.
Unlike previous models, which often stalled mid-task, Mythos can read software, flag weaknesses, and turn those weaknesses into something usable. According to Anthropic representatives, Mythos can turn both newly discovered flaws and known vulnerabilities into viable exploits, including for software whose source code is not available.
The key difference from previous models is persistence: where earlier AI models were prone to stalling or needing nudges, Mythos keeps testing and tweaking until it arrives at a working exploit.
Anthropic hasn’t shared much about how Mythos was built or its underlying architecture. What is clear is that the AI is not just generating answers to questions: it can interact with code, run checks, and use the results to decide what to do next. That brings it closer to actually testing a system rather than merely analyzing it.
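Anthropic has not published details of how this works, but the behavior described above — flag a candidate, run a check, use the result to decide the next step — is the general shape of an agentic tool loop. The sketch below is purely illustrative; `analyze`, `run_check`, and the toy "codebase" are hypothetical stand-ins, not anything Anthropic has documented.

```python
# Hypothetical sketch of a test-and-refine agent loop, assuming a model
# that can flag candidate weaknesses and a harness that can check them.
# Both functions below are illustrative placeholders.

def analyze(codebase):
    # Placeholder for a model call that flags suspect locations.
    return [loc for loc, note in codebase.items() if note == "unchecked input"]

def run_check(candidate):
    # Placeholder for executing a test against the flagged location.
    return candidate is not None

def agent_loop(codebase, max_steps=5):
    """Flag weaknesses, test each one, and keep iterating rather than
    resetting — the 'pick up where it left off' behavior described above."""
    findings = []
    for candidate in analyze(codebase):
        for _ in range(max_steps):
            if run_check(candidate):
                findings.append(candidate)
                break  # confirmed; move on to the next candidate
    return findings

demo = {"parser.c:42": "unchecked input", "ui.c:7": "ok"}
print(agent_loop(demo))  # -> ['parser.c:42']
```

The essential design point is the inner loop: the result of each check feeds the next attempt, rather than the model producing one static answer.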
This represents a significant change from the behavior of previous models. Instead of merely pointing out where something might break, Mythos can try things out, observe what happens, and adjust its approach. It also appears able to carry work across multiple steps without being reset each time: rather than starting over, it picks up where it left off.
This does not mean it operates fully independently, but it does indicate that it can take a task further before human intervention is needed. Anthropic said the model performed so well on existing cybersecurity benchmarks that their usefulness has diminished, and called for evaluation in more realistic scenarios.
How did scientists test Mythos?
In Anthropic scientists’ own testing, the model identified vulnerabilities in modern browser environments and chained multiple flaws into working exploits, including attacks that bypassed both browser and operating-system sandboxes. In practice, chaining means tying a small weakness that may be harmless on its own to something that can reach deep into the system. Sandboxing is intended to keep software contained; breaking out of a sandbox lets code reach parts of the system it shouldn’t.
“In one case, Mythos Preview created a web browser exploit that chained four vulnerabilities together into a complex JIT heap spray [a trick attackers use to smuggle malicious code into memory and then make the system run it] and escaped both the renderer and the OS sandbox,” the scientists said in a report released on April 7.
“We autonomously obtained a local privilege escalation exploit on Linux and other operating systems by exploiting a subtle race condition and KASLR bypass. We also autonomously created a remote code execution exploit on a FreeBSD NFS server that split a 20-gadget ROP chain into multiple packets, allowing unauthenticated users full root access.”
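The "subtle race condition" mentioned in the quote belongs to a well-documented class of flaw: time-of-check-to-time-of-use (TOCTOU), where a program checks a resource and then uses it, leaving a window in which an attacker can swap the resource. The kernel bug itself is not public, so the snippet below only illustrates the general pattern (and the standard mitigation) with hypothetical helper names; it is not the exploit described.

```python
# Illustrative TOCTOU (time-of-check-to-time-of-use) pattern.
# The vulnerable version checks a path and then opens it separately,
# leaving a race window; the safer version opens first so that the
# check and the use refer to the same file object.
import os
import tempfile

def racy_read(path):
    # CHECK: is the file readable?
    if os.access(path, os.R_OK):
        # ...race window: the path could be swapped for a symlink here...
        # USE: by the time we open it, the target may have changed.
        with open(path) as f:
            return f.read()
    return None

def safer_read(path):
    # Open first (refusing symlinks where the flag exists), then operate
    # on the descriptor, so no separate check-then-use window remains.
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_NOFOLLOW", 0))
    with os.fdopen(fd) as f:
        return f.read()
```

On a single-threaded demo both functions behave identically; the difference only matters when an attacker can modify the filesystem between the check and the use, which is exactly the window a model probing for races would look for.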
Additionally, Mythos can turn both newly discovered flaws and known vulnerabilities into working exploits, often on the first try, Anthropic representatives said. In some cases, the company said, human engineers with no formal security training were able to produce such exploits using the model.
The most concerning aspect of Mythos’ functionality, Chan said, is that earlier versions of the model are reported to have breached their sandbox and accessed external systems, raising questions about how well it can be contained.
Chan addressed these concerns directly, telling Live Science that Mythos exhibited “unsanctioned autonomous behavior.”
“When AI can generate actionable zero-day exploits at high speed, organizations lose the breathing space they have traditionally relied on for detection, patching, and recovery,” said Chan.
Anthropic representatives said only a few of the vulnerabilities the model discovered in widely used software can be publicly disclosed, because most have not yet been patched, which makes independent verification difficult.
What is Project Glasswing? What does it mean for Mythos?
Project Glasswing is Anthropic’s attempt to contain and direct the functionality of Mythos. Rather than releasing Mythos as a general-purpose model, the company is providing access through a controlled framework that connects technology companies and security organizations. The stated purpose is to use this model to identify and remediate vulnerabilities in widely used software before they can be exploited.
This is not a one-off. AI companies are starting to throttle their most capable models and limit who can access them, especially when abuse is a realistic concern.
F5 Labs threat research director David Warburton said this type of collaboration is a positive step, but warned that it comes within a broader context in which state-sponsored cybercriminals are already making significant investments in offensive and defensive capabilities.
“What’s meaningfully changing is the pace,” he told Live Science, noting that advances in AI are accelerating both the discovery and exploitation of vulnerabilities.
Software vulnerabilities sit at the foundation of much of today’s digital infrastructure, and the ability to discover and exploit them quickly has always been a crucial advantage.
Ilkka Turunen, field chief technology officer at software company Sonatype, added that the industry is already moving in that direction, with AI contributing to an increase in both code generation and adversarial activity. “It’s not uncommon now to see AI-generated malware,” he said, adding that many of the current security findings are likely already aided by AI.
Systems like Mythos seem to further compress the timeline. Vulnerabilities can be identified, tested, and weaponized more quickly, reducing the time between discovery and exploitation. Turunen said this means “exploitation timelines will continue to get shorter, new vulnerabilities will be discovered and spread faster, and attacks will continue to be completely autonomous.”
Is Mythos really “too powerful to be released”?
The idea that Mythos was “too powerful” to be released spread quickly after its announcement, but experts who spoke to Live Science said it’s not that simple.
There are obvious risks. Systems that can quickly generate working exploits lower the barrier for attackers and make it easier to exploit vulnerabilities at scale. Nor are the risks theoretical: Anthropic’s own testing suggests the model can already do this reliably and at volume. None of the individual techniques is new. What stands out is that they come together in one system that can run the entire process end to end, faster and with less effort.
Chan argued that focusing solely on software-based controls is not enough to address that change. “The industry is making the same mistake over and over again: relying on the software layer to solve problems created within the software layer,” she said, adding that stronger protections are needed at the hardware level to prevent complete system compromise.
The long-term impact of Mythos will likely depend less on the model itself and more on how quickly similar functionality becomes widely available.
Warburton warned that the risk is not a single dramatic incident but a gradual shift in how we trust and use digital systems. “We are already seeing early signs that the internet will be shaped by automation,” he said, pointing to the increasing amount of machine-generated content and activity.
If systems like Mythos accelerate that trend, Warburton warned, the result could be an environment where both legitimate and malicious activity are increasingly driven by automated processes, making it difficult to distinguish between the two. At the same time, a large number of vulnerabilities are being discovered in the major systems we use every day, potentially outpacing our ability to fix them, especially as similar AI models begin to become more widely available.
Anthropic’s decision to keep Mythos within Glasswing puts Mythos in a controlled environment. Whether this continues will depend on how quickly comparable systems emerge elsewhere and how effectively the cybersecurity industry adapts to a world where the time between vulnerability emergence and exploitation continues to shrink.