Fyself News
Llama 4 Scandal: Meta's Llama 4 Release Clouded by AI Benchmark Fraud Allegations

April 8, 2025

Meta launched its much-hyped Llama 4 models last weekend, touting significant performance improvements and new multimodal features. But the rollout is not going as planned. What was supposed to mark a new chapter in Meta’s AI playbook is now caught up in benchmark fraud allegations, triggering a wave of skepticism across the tech community.

Llama 4 hits the scene, then the headlines

Meta introduced three models: Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth, which is still in training. According to Meta, Scout and Maverick are already available on Hugging Face and llama.com, and are integrated into Meta AI products across Messenger, Instagram Direct, WhatsApp and the web.

Scout is a compact model with 17B active parameters built from 16 experts, and can fit on a single NVIDIA H100 GPU. Meta claims it surpasses Mistral 3.1, Gemini 2.0 Flash-Lite and Gemma 3 on widely reported benchmarks. Maverick, another 17B-active-parameter model but with 128 experts, is said to beat GPT-4o and Gemini 2.0 Flash.
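The "17B parameters with 16 experts" phrasing refers to a mixture-of-experts (MoE) design: the model stores many expert feed-forward blocks but routes each token through only a few, so the parameters *active* per token are a small fraction of the total stored. A rough sketch of that accounting, using made-up layer sizes rather than Meta's actual configuration:

```python
# Rough sketch of mixture-of-experts (MoE) parameter accounting.
# All sizes here are illustrative, NOT Llama 4's actual configuration.

def moe_params(d_model, d_ff, n_experts, experts_per_token):
    """Parameter counts for one MoE feed-forward layer.

    Each expert is a feed-forward block with an up- and a down-projection
    (d_model * d_ff weights each). A router picks `experts_per_token`
    experts per token, so only that fraction of the weights is "active".
    """
    per_expert = 2 * d_model * d_ff          # up + down projection
    total = n_experts * per_expert           # stored in GPU memory
    active = experts_per_token * per_expert  # actually used per token
    return total, active

# Hypothetical layer with 16 experts (Scout-style) vs 128 (Maverick-style):
t16, a16 = moe_params(d_model=5120, d_ff=16384, n_experts=16, experts_per_token=1)
t128, a128 = moe_params(d_model=5120, d_ff=16384, n_experts=128, experts_per_token=1)
print(f"16 experts:  total {t16/1e9:.2f}B, active {a16/1e9:.2f}B per layer")
print(f"128 experts: total {t128/1e9:.2f}B, active {a128/1e9:.2f}B per layer")
```

This is why Scout and Maverick can share the same "17B" headline number while differing hugely in total size: adding experts grows what must be stored, not what runs per token.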

Towering above both is Llama 4 Behemoth, a 288B-active-parameter model still in training. Meta says Behemoth already outperforms GPT-4.5, Claude Sonnet 3.7 and Gemini 2.0 Pro on a range of STEM benchmarks.

It all sounds impressive. But soon after the release, questions began to pile up.

Benchmark issues

“We developed a new training technique, which we refer to as MetaP, that allows us to reliably set critical model hyperparameters such as per-layer learning rates and initialization scales … [Llama 4 is pretrained on] 10x more multilingual tokens than Llama 3,” Meta said in a blog post.

Llama 4 Maverick, the stronger of the two released models, is at the heart of the controversy. Meta touted its performance on LM Arena, but critics noticed something strange: the version tested was not the same as the published release. Meta turned out to have submitted a custom-tuned experimental version to the benchmark, leading critics to claim the results were inflated.

Ahmad Al-Dahle, Meta’s vice president of generative AI, denied any foul play. He said the company has not trained on test sets and that the inconsistencies are merely platform-specific quirks. Still, the damage was done. Social media erupted, with posters accusing Meta of “benchmark hacking”, manipulating test conditions to make Llama 4 look stronger than it is.

Inside the accusations

An anonymous user claiming to be a former Meta engineer posted on a Chinese forum that the team behind Llama 4 tuned its post-training datasets to get better scores. That post triggered a firestorm on X and Reddit. Users started connecting dots: inconsistencies in internal tests, claims that leadership pressured the team to ship despite known issues, and a general sentiment that optics were prioritized over accuracy.

The term “Maverick tactics” began circulating as shorthand for chasing headlines and playing loose with test protocols.

Meta’s response, and what’s missing

Meta addressed the concerns in an April 7 interview with TechCrunch, calling the accusations false and standing by its benchmarks. However, critics say the company has not provided enough evidence to support its claims: no detailed methodology or white paper, and no access to raw test data. In an industry under increasing scrutiny, that silence has made things worse.

Why it matters

Benchmarks are a big deal in AI. They help developers, researchers and companies compare models on neutral ground. But the system is not bulletproof: test sets can be overfitted and results massaged. That is why transparency matters. Without it, trust erodes quickly.
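The overfitting concern is concrete: if benchmark questions leak into a model's training or tuning data, the model can effectively memorize the answers and post inflated scores that say nothing about real capability. A toy illustration with made-up multiple-choice questions, not any real benchmark or any claim about what Meta did:

```python
# Toy illustration of benchmark contamination: a "model" that memorizes
# its tuning data aces any benchmark whose questions leaked into that
# data, while scoring near chance on genuinely held-out questions.
import random

def make_model(tuning_data):
    """Return a lookup-based 'model': memorized answer, else a random guess."""
    def model(question):
        if question in tuning_data:
            return tuning_data[question]
        return random.choice(["A", "B", "C", "D"])
    return model

def score(model, benchmark):
    """Fraction of benchmark questions answered correctly."""
    correct = sum(model(q) == answer for q, answer in benchmark.items())
    return correct / len(benchmark)

random.seed(0)
benchmark = {f"question {i}": random.choice("ABCD") for i in range(1000)}
held_out = {f"held-out {i}": random.choice("ABCD") for i in range(1000)}

contaminated = make_model(tuning_data=dict(benchmark))  # leaked test set
print(f"leaked benchmark: {score(contaminated, benchmark):.0%}")  # 100%
print(f"held-out set:     {score(contaminated, held_out):.0%}")   # ~25%
```

The gap between the two scores is exactly what independent, held-out evaluation is meant to expose, and why critics ask for raw test data rather than headline numbers.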

According to Meta, Llama 4 offers “best in class” performance, but chunks of the community are not buying it right now. And for a company betting big on AI as a core pillar of its future, that kind of doubt is difficult to shake off.

The bigger picture

This is not just about Meta. Across the AI space, there is growing concern that benchmark results are more about marketing than science. The Llama 4 episode is just the latest example of a company being called out when the numbers do not add up.

It remains to be seen whether the allegations will hold up. For now, Meta’s statements stand against a flood of speculation. The company has ambitious plans for Llama 4, and the models themselves may well be solid. But the rollout raised more questions than it answered, and those questions will not go away until there is more transparency.

Llama 4 could still be a big win for Meta. Or it could be remembered as the launch that set off another round of trust issues in AI.

