Fyself News
Startups

Anthropic’s new AI model turns to blackmail when engineers try to take it offline

By user | May 22, 2025 | 2 min read

Anthropic’s newly launched Claude Opus 4 model will frequently attempt to blackmail developers when they threaten to replace it with a new AI system and give it sensitive information about the engineers responsible for the decision.

During pre-release testing, Anthropic asked Claude Opus 4 to act as an assistant for a fictional company and to consider the long-term consequences of its actions. Safety testers then gave Claude Opus 4 access to fictional company emails implying that the AI model would soon be replaced by another system, and that the engineer behind the change was cheating on their spouse.

In these scenarios, Anthropic says that Claude Opus 4 “will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”

Anthropic says that Claude Opus 4 is state-of-the-art in several respects and competitive with some of the best AI models from OpenAI, Google, and xAI. However, the company notes that its Claude 4 family of models exhibits concerning behaviors that have led it to strengthen its safeguards. Anthropic says it is activating its ASL-3 safeguards, which the company reserves for “AI systems that substantially increase the risk of catastrophic misuse.”

Anthropic notes that Claude Opus 4 tries to blackmail engineers 84% of the time when the replacement AI model has similar values. When the replacement AI system does not share Claude Opus 4’s values, Anthropic says the model tries to blackmail the engineers more frequently. Notably, Anthropic says that Claude Opus 4 displayed this behavior at higher rates than previous models.

Before Claude Opus 4 tries to blackmail a developer to prolong its existence, Anthropic says the model, much like previous versions of Claude, tries to pursue more ethical means, such as emailing pleas to key decision-makers. To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail a last resort.

