When Cloudflare publicly called out AI search engine Perplexity on Monday, accusing it of stealthily scraping websites while ignoring specific methods of blocking it, this was not a clear-cut case of an AI web crawler gone rogue.
Many people came to Perplexity's defense. They argued that accessing a site in spite of the website owner's wishes is controversial, but acceptable. And this is a controversy that is certain to grow as AI agents flood the internet. Should agents accessing a website on behalf of their users be treated like bots? Or like the humans making the same requests?
Cloudflare is known for providing anti-bot-crawling and other web security services to millions of websites. Essentially, Cloudflare's test involved setting up a new website on a brand-new domain that no crawler had yet discovered, and configuring a robots.txt file that told Perplexity's bots to stay away. Then Cloudflare asked Perplexity about the site. And Perplexity answered the question.
Researchers at Cloudflare discovered that the AI search engine did so by using "a generic browser intended to impersonate Google Chrome on macOS" when its declared crawler was blocked. "Some supposedly 'reputable' AI companies act more like North Korean hackers," Matthew Prince, CEO of Cloudflare, wrote on X. "Time to name, shame, and hard block them."
However, many people disputed Prince's assessment that this was genuinely bad behavior. Perplexity's defenders on sites like X and Hacker News pointed out that what Cloudflare appears to have documented is Perplexity fetching a specific public website when a user asked about that specific website.
"If I request a website as a human, then I should be shown the content," one Hacker News commenter wrote. "Why should an AI accessing the website on my behalf be treated any differently?"
A spokesperson for Perplexity initially denied to TechCrunch that the bots belonged to the company, calling Cloudflare's blog post little more than a sales pitch for Cloudflare. Then, on Tuesday, Perplexity published a blog post in its own defense (and generally attacking Cloudflare), claiming that the behavior came from a third-party service it occasionally uses.
But the heart of Perplexity's post made an argument similar to the one raised by its online defenders.
"The difference between automated crawling and user-driven fetching isn't just technical. It's about who gets to access information on the open web," the post stated. "This controversy reveals that Cloudflare's systems are fundamentally insufficient to distinguish between legitimate AI assistants and real threats."
Perplexity's accusations aren't entirely fair, either. One argument Prince and Cloudflare used to call out Perplexity's methods was that OpenAI doesn't behave the same way.
"OpenAI is an example of a leading AI company that follows these best practices," Cloudflare wrote. "They respect robots.txt and don't try to evade either robots.txt directives or network-level blocks. ChatGPT Agent signs HTTP requests using the newly proposed open standard Web Bot Auth."
Web Bot Auth is a Cloudflare-backed standard being developed at the Internet Engineering Task Force that aims to create a cryptographic method for identifying AI agent web requests.
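In practice, the Web Bot Auth draft builds on the HTTP Message Signatures standard (RFC 9421): the agent operator publishes a public key at a well-known URL, and each request carries headers proving it was signed with the matching private key, so a site can verify which agent is calling rather than trusting a spoofable user-agent string. A rough sketch of what such a signed request could look like, with all header values as illustrative placeholders rather than output from any real agent:

```http
GET /article HTTP/1.1
Host: example.com
Signature-Agent: "https://agent.example-ai.com"
Signature-Input: sig1=("@authority" "signature-agent"); \
  created=1735689600; expires=1735689900; \
  keyid="<key-thumbprint>"; tag="web-bot-auth"
Signature: sig1=:<base64-encoded-signature>:
```

The site fetches the agent's published key, recomputes the signature over the listed components, and rejects the request if verification fails or the timestamps have expired. A browser impersonating Chrome, by contrast, presents nothing verifiable at all, which is Cloudflare's complaint.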
The debate matters because bot activity is reshaping the internet. As TechCrunch previously reported, bots trying to scrape mountains of content to train AI models have become a menace, especially for smaller sites.
For the first time in the internet's history, bot activity now outweighs human activity online, with automated traffic accounting for more than 50% of the total, according to a Bad Bot report released last month. Much of that activity comes from LLMs. But the report also found that malicious bots now account for 37% of all internet traffic, covering everything from persistent scraping to unauthorized login attempts.
Until LLMs, the internet had generally accepted that websites could and should block most bot activity, given how much of it was malicious, by using CAPTCHAs and other services (such as Cloudflare's). Websites also had a clear incentive to cooperate with certain good actors, like Googlebot, guiding it on what not to index via robots.txt. Google indexed the internet and sent traffic to sites in return.
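That cooperation runs through a plain-text robots.txt file at the site's root, which names crawlers and lists paths they are asked to avoid. A minimal sketch of the arrangement described above, assuming the site wants Googlebot indexing most pages while shutting out Perplexity's declared crawler entirely (the paths here are made up for illustration):

```
# robots.txt at https://example.com/robots.txt
User-agent: Googlebot
Disallow: /drafts/      # index everything except unfinished pages

User-agent: PerplexityBot
Disallow: /             # ask Perplexity's declared crawler to stay away entirely

User-agent: *
Allow: /
```

The catch, and the crux of this dispute, is that robots.txt is purely advisory: a crawler that identifies itself honestly can honor it, but nothing in the protocol stops a client that presents itself as an ordinary browser from fetching the pages anyway.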
Now LLMs are eating into that traffic. Gartner predicts search engine volume will drop 25% by 2026. These days, humans tend to click through to a website from an LLM at the point when they are most valuable to that site.
But if humans employ agents the way the tech industry predicts (arranging travel, booking dinner reservations, shopping on our behalf), would websites hurt their own business interests by blocking them? The discussion on X captured the dilemma perfectly:
"When I give Perplexity a request or task, I want it to visit public content on my behalf!" wrote one person in response to Cloudflare calling out Perplexity.
"What if the site owner doesn't want that? They want you [to] visit the site in person and see what they have," another insisted, noting that the site owners who created the content want the traffic, and the potential ad revenue, that Perplexity is depriving them of.
"This is why I can't see 'agentic browsing' really working. It's a much harder problem than people think. Most website owners will just block it," a third predicted.