FYMOUS News

Meta exec denies the company artificially boosted Llama 4’s benchmark scores

April 7, 2025

On Monday, a Meta executive denied a rumor that the company had tuned its new AI models to score well on specific benchmarks while hiding the models’ weaknesses.

Ahmad Al-Dahle, vice president of generative AI at Meta, denied in a post on X that Meta trained its Llama 4 Maverick and Llama 4 Scout models on “test sets.” In AI benchmarks, a test set is a collection of data used to evaluate a model’s performance after it has been trained. Training on a test set can misleadingly inflate a model’s benchmark scores, making the model appear more capable than it actually is.
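
To make the concern concrete, here is a toy Python sketch of test-set contamination in general. It is not a depiction of Meta’s training process or of any real benchmark: the `MemorizingModel` class, the remainder-by-three task and the three datasets are all hypothetical, invented purely for illustration. The “model” simply memorizes every example it is trained on and guesses 0 for anything it has not seen.

```python
# Toy illustration of test-set contamination. Every name here is hypothetical.

def make_examples(numbers):
    """Build (question, answer) pairs; the toy task is 'n modulo 3'."""
    return [(n, n % 3) for n in numbers]


class MemorizingModel:
    """A deliberately weak model: it memorizes what it sees, else guesses 0."""

    def __init__(self):
        self.memory = {}

    def train(self, examples):
        for question, answer in examples:
            self.memory[question] = answer

    def predict(self, question):
        return self.memory.get(question, 0)


def accuracy(model, examples):
    """Fraction of examples the model answers correctly."""
    correct = sum(model.predict(q) == a for q, a in examples)
    return correct / len(examples)


train_set = make_examples(range(2, 50))     # data meant for training
test_set = make_examples(range(50, 100))    # the "benchmark"
fresh_set = make_examples(range(100, 150))  # data neither model has seen

clean = MemorizingModel()
clean.train(train_set)

contaminated = MemorizingModel()
contaminated.train(train_set + test_set)    # the benchmark leaks into training

clean_score = accuracy(clean, test_set)
contaminated_score = accuracy(contaminated, test_set)
contaminated_fresh_score = accuracy(contaminated, fresh_set)

print("clean model on benchmark:       ", clean_score)
print("contaminated model on benchmark:", contaminated_score)
print("contaminated model on new data: ", contaminated_fresh_score)
```

In this sketch the contaminated model answers every benchmark question correctly only because it has already seen the answers. On data it has not seen, it does no better than guessing, which is the gap between score and capability that the rumor alleged.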

Over the weekend, an unsubstantiated rumor that Meta had artificially boosted its new models’ benchmark results began to circulate on X and Reddit. The rumor appears to have originated from a post on a Chinese social media site by a user who claimed to have resigned from Meta in protest of the company’s benchmarking practices.

Reports that Maverick and Scout perform poorly on certain tasks fueled the rumor, as did Meta’s decision to use an experimental, unreleased version of Maverick to achieve better scores on the benchmark LM Arena. Researchers on X have observed significant differences between the behavior of the publicly available Maverick and that of the model hosted on LM Arena.

Al-Dahle acknowledged that some users are seeing “mixed quality” from Maverick and Scout across the various cloud providers that host the models.

“We dropped the models as soon as they were ready, so we expect it will take several days for all the public implementations to get dialed in,” Al-Dahle said. “We will keep working through our bug fixes and onboarding partners.”

