Fyself News
Startups

Perplexity accused of scraping websites that explicitly blocked AI scraping

By user | August 4, 2025 | 3 Mins Read

According to internet infrastructure provider Cloudflare, AI startup Perplexity is crawling and scraping content from websites that have explicitly indicated they do not want to be scraped.

On Monday, Cloudflare published research saying it had observed the AI startup ignoring blocks and hiding its crawling and scraping activities. The network infrastructure giant accused Perplexity of obscuring its identity when trying to scrape web pages “to avoid website preferences,” Cloudflare researchers wrote.

AI products like those Perplexity offers rely on gobbling up large amounts of data from the internet, and AI startups have repeatedly scraped text, images, and videos from the internet without permission to make their products work. Recently, websites have tried to fight back using the web standard robots.txt file, which tells search engines and AI companies which pages they may and may not index. These efforts have seen mixed results.
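As a rough illustration of how a robots.txt file expresses these preferences, the sketch below uses Python's standard-library urllib.robotparser to evaluate a made-up rule set. The crawler name ExampleAIBot, the rules, and the URL are all assumptions for the example and do not come from Cloudflare's post or from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named AI crawler is shut out of the whole
# site, while every other client is allowed everywhere.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "ExampleAIBot", page))  # the named crawler
    print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", page))   # any other client
```

The file is only a request. Nothing in the protocol stops a crawler from fetching the page anyway, and the answer depends entirely on the name the crawler chooses to present.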

According to Cloudflare, Perplexity appears willing to bypass these blocks by changing its bots' “user agent,” the string a client sends to identify itself to a website.
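To see why a change of user agent is enough to slip past a simple block, consider a minimal sketch of a server-side filter that matches on the User-Agent header. The token exampleaibot and both header strings are hypothetical stand-ins, not values reported by Cloudflare, and real bot management relies on far more than this single header.

```python
# Hypothetical blocklist of substrings that identify declared crawlers.
BLOCKED_AGENT_TOKENS = ("exampleaibot",)


def is_blocked(user_agent_header: str) -> bool:
    """Return True if the User-Agent header contains a blocked crawler token."""
    ua = user_agent_header.lower()
    return any(token in ua for token in BLOCKED_AGENT_TOKENS)


# Illustrative header a crawler might send when it identifies itself honestly.
DECLARED_AGENT = "Mozilla/5.0 (compatible; ExampleAIBot/1.0; +https://example.com/bot)"

# Illustrative header in the style of a desktop Chrome browser on macOS.
GENERIC_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

if __name__ == "__main__":
    print(is_blocked(DECLARED_AGENT))  # the declared crawler matches the blocklist
    print(is_blocked(GENERIC_AGENT))   # the browser-style string does not
```

Because the header is supplied by the client, a filter like this only stops crawlers that identify themselves truthfully.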

“This activity was observed across tens of thousands of domains and millions of requests per day. We were able to fingerprint this crawler using a combination of machine learning and network signals,” read Cloudflare's post.

Perplexity spokesperson Jesse Dwyer dismissed Cloudflare's blog post as a “sales pitch,” adding in an email to TechCrunch that it “indicates that the content was not accessed.” In a follow-up email, Dwyer insisted that the bot named in the Cloudflare blog was “not us.”

Cloudflare said it first noticed the behavior after customers complained that Perplexity was crawling and scraping their sites even after they had taken steps to stop it, specifically by blocking Perplexity's known bots. Cloudflare then ran tests to check and confirmed that Perplexity was circumventing these blocks.


Cloudflare said it observed that Perplexity uses not only its declared user agent, but also a generic browser that impersonates Google Chrome on macOS when its declared crawler is blocked.

The company also said it has removed Perplexity's bots from its verified list and added new techniques to block them.

Cloudflare has recently taken a public stance against AI crawlers. Last month, Cloudflare announced the launch of a marketplace that allows website owners and publishers to charge AI scrapers that visit their sites. Cloudflare CEO Matthew Prince sounded the alarm at the time, saying that AI was breaking the internet, particularly publishers' business models. Last year, Cloudflare launched a free tool to prevent bots from scraping websites to train AI.

This is not the first time Perplexity has been accused of scraping without permission.

Last year, news outlets such as Wired alleged that Perplexity was plagiarizing their content. A few weeks later, Perplexity CEO Aravind Srinivas was unable to answer immediately when asked to provide a definition of plagiarism in an interview with TechCrunch's Devin Coldewey at the Disrupt 2024 conference.

