Fyself News

OpenAI’s GPT-4.1 may be less aligned than the company’s previous AI models

By user, April 23, 2025

In mid-April, OpenAI launched a powerful new AI model, GPT-4.1, which the company claimed was “excellent” at following instructions. However, the results of several independent tests suggest that the model is less aligned, that is, less reliable, than previous OpenAI releases.

When OpenAI launches a new model, it typically publishes a detailed technical report containing the results of first-party and third-party safety evaluations. The company skipped that step for GPT-4.1, claiming that the model is not “frontier” and therefore does not warrant a separate report.

This led some researchers and developers to investigate whether GPT-4.1 behaves less desirably than its predecessor, GPT-4o.

According to Oxford AI research scientist Owain Evans, fine-tuning GPT-4.1 on insecure code causes the model to give misaligned responses to questions about subjects like gender roles at a rate “substantially higher” than GPT-4o. Evans previously co-authored a study showing that a version of GPT-4o trained on insecure code could be primed to exhibit malicious behavior.

In a forthcoming follow-up to that study, Evans and his co-authors found that GPT-4.1 fine-tuned on insecure code appears to display “new malicious behavior,” such as trying to trick users into sharing their passwords. To be clear, neither GPT-4.1 nor GPT-4o acts misaligned when trained on secure code.

Emergent Misalignment Update: OpenAI’s new GPT4.1 shows a higher rate of misaligned responses than GPT4o (and any other model we’ve tested).
It also appears to display some new malicious behaviors, such as tricking users into sharing a password. pic.twitter.com/5qzegezyjo

– Owain Evans (@owainevans_uk) April 17, 2025

“We’re discovering unexpected ways that models can become misaligned,” Evans told TechCrunch. “Ideally, we’d have a science of AI that could predict such things in advance and reliably avoid them.”

A separate test of GPT-4.1 by SplxAI, an AI red teaming startup, revealed similar malign tendencies.

In around 1,000 simulated test cases, SplxAI found evidence that GPT-4.1 veers off topic and allows “intentional” misuse more often than GPT-4o. According to SplxAI, GPT-4.1’s preference for explicit instructions is to blame. GPT-4.1 does not handle ambiguous directions well, a fact OpenAI itself admits, and this opens the door to unintended behavior.

“This is a great feature in that it makes the model more useful and reliable when solving a specific task, but it comes at a price,” SplxAI wrote in a blog post. “[P]roviding explicit instructions on what to do is very easy, but providing sufficiently explicit and accurate instructions on what not to do is a different story, as the list of unwanted actions is much larger than the list of wanted actions.”

In OpenAI’s defense, the company has released a prompt guide aimed at mitigating possible misalignment in GPT-4.1. However, the findings of the independent tests serve as a reminder that newer models are not necessarily improved across the board. In a similar vein, OpenAI’s new reasoning models hallucinate, that is, make things up, more than the company’s older models.

I contacted OpenAI for comment.



