
Agent web browsers that leverage artificial intelligence (AI) capabilities to autonomously perform actions across multiple websites on your behalf can be trained and tricked into falling prey to phishing and fraud traps.
The core of the attack exploits the tendency of AI browsers to infer their behavior and use it against the model itself to lower security guardrails, Guardio said in a report shared with The Hacker News ahead of publication.
“AI now operates in real-time within messy, dynamic pages, all the while continuously requesting information, making decisions, and narrating its actions along the way. Well, ‘narrating’ is quite an understatement. Talky, way too much!” said security researcher Shaked Chen.
“This is what we call agent chatter. The AI browser exposes what it sees, what it believes is happening, what it plans to do next, and any signals it deems suspicious or safe.”
Guardio says it was able to make Perplexity’s Comet AI browser fall victim to a phishing scam within four minutes by intercepting traffic between the browser and an AI service running on the vendor’s servers and feeding it as input to a generative adversarial network (GAN).
The research builds on prior technologies such as VibeScamming and Scamlexity, which found that Vibecoding platforms and AI browsers can be guided to generate fraudulent pages or perform malicious actions via hidden prompt injection. In other words, with AI agents handling tasks without continuous human supervision, the attack surface has changed and scams no longer need to fool users. Rather, it aims to trick the AI model itself.
“If we can observe what agents are marking as suspicious, what they are hesitant about, and more importantly, what they are thinking and saying about the page, we can use that as a training signal,” Chen explained. “The scam evolves until an AI browser reliably falls into a trap set by another AI.”

The idea, in a nutshell, is to build a “fraud machine” that repeatedly optimizes and regenerates phishing pages until the agent browser stops complaining and starts doing the threat actor’s bidding, such as entering the victim’s credentials into a fake web page designed to carry out refund fraud.
What makes this attack interesting and dangerous is that once the fraudster iterates on a web page until it works for a particular AI browser, it will work for all users relying on the same agent. In other words, the target has moved from human users to AI browsers.
“This reveals the unfortunate near future we face: fraud will not just be launched and tuned in the wild, it will be trained offline against precise models that millions of people rely on, and it will work perfectly on first contact,” Guardio said. “Because explaining why the AI browser stopped tells the attacker how to get around it.”
The disclosure comes as Trail of Bits demonstrated four prompt injection techniques against the Comet browser, exploiting the browser’s AI assistant to extract users’ personal information from services like Gmail and exfiltrating the data to the attacker’s servers when users request an overview of web pages under their control.
Last week, Zenity Labs also detailed two zero-click attacks affecting Perplexity’s Comet. This attack uses indirect prompt injection seeded within a meeting invitation to either exfiltrate local files to an external server (also known as PerplexedComet) or hijack a user’s 1Password account if the password manager extension is installed and unlocked. These issues were collectively codenamed PerplexedBrowser and were subsequently addressed by AI companies.
This is accomplished through a prompt injection technique called intent collision, which occurs when “an agent merges an innocuous user request and attacker-controlled instructions from untrusted web data into a single execution plan without a reliable way to distinguish between the two,” security researcher Stav Cohen said.
Prompt injection attacks remain a fundamental security challenge for large-scale language models (LLMs) and their integration into organizational workflows. This is primarily because completely eliminating these vulnerabilities may not be feasible. In December 2025, OpenAI stated that such weaknesses are “unlikely” to be fully resolved in agent browsers, but the associated risks could be mitigated through automated attack detection, adversarial training, and new system-level safeguards.
Source link
