Fyself News
Startups

Meta’s vanilla Maverick AI model ranks below rivals on a popular chat benchmark

By user | April 11, 2025

Earlier this week, Meta landed in hot water for using an experimental, unreleased version of its Llama 4 Maverick model to achieve a high score on the crowdsourced benchmark LM Arena. The incident prompted LM Arena's maintainers to apologize, change their policies, and score the unmodified, vanilla Maverick.

As it turns out, it is not very competitive.

As of Friday, the unmodified Maverick, “Llama-4-Maverick-17B-128e-Instruct,” was ranked below models including OpenAI’s GPT-4o and Google’s Gemini 1.5 Pro. Many of these models are months old.

The release version of Llama 4 was added to LMArena after it was discovered they had cheated, but you probably didn’t see it because you have to scroll down to 32nd place.

– ρ:eeσn (@pigeon__s) April 11, 2025

Why the poor performance? Meta’s experimental Maverick, Llama-4-Maverick-03-26-Experimental, was optimized for conversation, the company explained in a chart released last Saturday. Those optimizations evidently played well on LM Arena, where human raters compare the outputs of models and choose the one they prefer.

As I wrote before, for a variety of reasons, LM Arena has not been the most reliable measure of an AI model’s performance. Still, tailoring a model to a benchmark is not only misleading; it also makes it difficult for developers to predict how well the model will perform in different contexts.

In a statement, a Meta spokesperson told TechCrunch that Meta experiments with “all kinds of custom variants.”

“‘Llama-4-Maverick-03-26-Experimental’ is a chat-optimized version that also performs well on LM Arena,” the spokesperson said. “We have now released our open source version and will see how developers customize Llama 4 for their own use cases. We look forward to seeing what they build and to their ongoing feedback.”



