Fyself News
Science

‘That’s not the way to build a digital mind’: How failures in reasoning are preventing AI models from achieving human-level intelligence

April 2, 2026

Architectural limitations in today’s most popular artificial intelligence (AI) tools may cap their ability to become more intelligent, new research suggests.

A study published February 5 on the preprint server arXiv argues that modern large language models (LLMs) are inherently prone to breakdowns in problem-solving logic known as “reasoning failures.”

Reasoning failures occur when an LLM loses sight of critical information needed to reliably solve a task, resulting in inaccurate answers to seemingly simple problems. Published as a review of existing research, the paper focused specifically on transformer models, a type of neural network architecture that powers popular AI chatbots such as ChatGPT, Claude, and Google Gemini.


Based on LLMs’ performance in assessments such as Humanity’s Last Exam, some scientists say the underlying neural network architecture could one day yield models that reach human-level cognition. While the transformer architecture makes LLMs extremely capable at tasks such as language generation, the researchers argue that it also hinders the reliable logical processing needed for true human-level reasoning.

“LLMs have demonstrated remarkable reasoning abilities and achieved impressive results across a wide range of tasks,” the researchers wrote in the study. “Despite these advances, significant reasoning failures persist, occurring even in seemingly simple scenarios… These failures stem from a lack of overall planning and deep thinking.”

LLM limitations

LLMs are trained on vast amounts of text data and generate responses to user prompts by predicting a plausible answer word by word. They do this by stringing together units of text called “tokens” according to statistical patterns learned from the training data.
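The word-by-word generation described above can be sketched as repeatedly picking the most probable next token from a learned distribution. Everything in this toy, the three-token vocabulary and its probabilities, is invented for illustration; a real LLM derives the distribution from billions of learned parameters rather than a lookup table.

```python
# Toy next-token distributions, keyed by the context seen so far.
# The contexts, tokens, and probabilities are invented for illustration.
NEXT_TOKEN_PROBS = {
    ("the",): {"cat": 0.5, "dog": 0.3, "idea": 0.2},
    ("the", "cat"): {"sat": 0.6, "ran": 0.4},
    ("the", "cat", "sat"): {"<end>": 1.0},
}

def generate(context):
    """Extend the context one token at a time until <end> is predicted."""
    tokens = list(context)
    while True:
        dist = NEXT_TOKEN_PROBS[tuple(tokens)]
        token = max(dist, key=dist.get)  # greedy: pick the most probable token
        if token == "<end>":
            return tokens
        tokens.append(token)

print(generate(["the"]))  # ['the', 'cat', 'sat']
```

Greedy selection is used here for determinism; real chatbots usually sample from the distribution, which is one reason the same prompt can yield different answers on different runs.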

Transformers also use a mechanism called “self-attention” to track relationships between words and concepts across long stretches of text. The combination of self-attention and huge training datasets makes modern chatbots very good at generating convincing answers to user prompts.
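Self-attention can be sketched in a few lines: each token scores itself against every other token in the sequence, and its output is the score-weighted mix of all tokens. This minimal version uses the raw embeddings directly as queries, keys, and values; a real transformer first applies learned linear projections to each, which this sketch omits.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of vectors.

    X is a list of token embeddings (each a list of floats). Queries,
    keys, and values are the embeddings themselves here; a real
    transformer applies learned projections first.
    """
    d = len(X[0])
    out = []
    for q in X:
        # Score this token against every token in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in X]
        weights = softmax(scores)
        # The output is the attention-weighted mix of all tokens.
        out.append([sum(w * v[j] for w, v in zip(weights, X)) for j in range(d)])
    return out

seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token embeddings
mixed = self_attention(seq)
```

Because the weights sum to 1, each output vector is a convex mix of the inputs, which is how information from distant tokens flows into the representation of the current one.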


However, LLMs do not do any actual “thinking” in the traditional sense; their responses are determined algorithmically. On long tasks, especially those that require serious problem-solving over multiple steps, a transformer can lose track of important information and default to patterns learned from its training data. The result is a reasoning failure.


“This fundamental weakness extends beyond basic tasks to multi-step math problems, checking multiple factual claims, and other tasks that are compositional in nature,” the researchers wrote in their study.

Reasoning failures are also why LLMs circle back to the same answer even after being told it is wrong, and why they can give different answers when the same question is worded slightly differently, even when asked to explain their reasoning step by step.


Federico Nanni, a senior research data scientist at Britain’s Alan Turing Institute, argues that what LLMs typically present as inference is mostly window dressing.

“We found that the model often got the correct answer when we asked it to ‘think step by step’ and write out its reasoning process first, rather than answer directly,” Nanni told Live Science. “But it’s a trick. It’s not real reasoning in the human sense. It’s still just next-token prediction disguised as a chain of thought. When we say these models ‘reason,’ what we really mean is that they write out an inference process, something that sounds like a chain of plausible inferences.”
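The “think step by step” trick Nanni describes is, mechanically, just prompt construction: the same question is wrapped in an instruction that makes the model emit intermediate tokens before the answer. The template wording below is a common convention, not a quote from the study.

```python
def direct_prompt(question):
    """Ask for the answer with no intermediate reasoning."""
    return f"Question: {question}\nAnswer:"

def chain_of_thought_prompt(question):
    """Ask the model to write out its reasoning before answering.

    The instruction wording is a common convention, not taken from the
    study; any phrasing that elicits intermediate steps plays this role.
    """
    return (
        f"Question: {question}\n"
        "Think step by step and write out your reasoning, "
        "then state the final answer on its own line.\nAnswer:"
    )

print(chain_of_thought_prompt("If Ann has 3 apples and eats 1, how many remain?"))
```

The point of the passage is that the extra tokens improve accuracy without changing what the model fundamentally does: every reasoning step it writes out is itself produced by next-token prediction.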

Gaps in existing AI benchmarks

Researchers found that current methods of evaluating LLM performance fall short in three key ways. First, small changes to a prompt’s wording can change the results. Second, benchmarks degrade and become contaminated the more they are used. And third, most benchmarks evaluate only the final answer, not the reasoning process the model used to reach it.
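The first gap, sensitivity to prompt wording, can be checked mechanically by re-scoring a model on paraphrases of each benchmark item and comparing the two accuracies. Everything below, the stub "model", the items, and the paraphrases, is invented for illustration; a real harness would call an actual LLM in place of the stub.

```python
def accuracy(model, items):
    """Fraction of (question, gold answer) pairs the model answers correctly."""
    return sum(model(q) == gold for q, gold in items) / len(items)

# Invented two-item benchmark: original wording and a paraphrase of each item.
ITEMS = [
    ("What is 2 + 2?", "4"),
    ("If Ann has 3 apples and eats 1, how many remain?", "2"),
]
PARAPHRASED = [
    ("Compute the sum of 2 and 2.", "4"),
    ("Ann holds 3 apples and eats one of them. How many are left?", "2"),
]

def brittle_model(question):
    """Stub standing in for an LLM that keys on surface wording."""
    memorized = {
        "What is 2 + 2?": "4",
        "If Ann has 3 apples and eats 1, how many remain?": "2",
    }
    return memorized.get(question, "?")

gap = accuracy(brittle_model, ITEMS) - accuracy(brittle_model, PARAPHRASED)
# A nonzero gap flags wording sensitivity that a single-wording benchmark hides.
```

The deliberately brittle stub scores perfectly on the original wordings and fails every paraphrase, which is exactly the failure mode a one-wording benchmark cannot detect.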

This means that current benchmarks may significantly exaggerate LLMs’ capabilities and underestimate how often they fail in real-world use.

LLMs’ limitations may mean they have limited real-world applications. (Image courtesy: da-kuk/Getty Images)

“Our position is not that benchmarks are flawed, but rather that they need to evolve,” study co-author Peiyan Song, a computer science and robotics student at Caltech, told Live Science in an email. Benchmarks also tend to leak into LLM training data, Nanni said, meaning later models can effectively memorize their answers and game them.

“Plus, now that the models are deployed in production, usage itself becomes a kind of benchmark,” Nanni said. “You put the system in front of users and see what goes wrong. That’s the new test. So, yes, we need better benchmarks, and we need to rely less on AI to check AI. But that’s actually very difficult, because these tools are now embedded in the way we work, and just using them is very useful.”

A new architecture for AGI?

Unlike other recent studies, this new study does not claim that the neural network approach to AI is a dead end in our quest to achieve artificial general intelligence (AGI). Rather, the researchers liken this to the early days of computing and point out that understanding why LLMs fail is the key to improving them.

However, they argue that simply training or scaling up a model with more data is unlikely to solve the problem on its own. This means that developing AGI may require a fundamentally different approach to how models are built.

“Neural networks, and LLMs in particular, are clearly part of the AGI picture, and the progress they’ve made is incredible,” said Song. “However, our research suggests that scaling alone is unlikely to resolve all reasoning failures… [meaning] reaching human-level reasoning will likely require architectural innovations, stronger world models, improved robustness training, and deeper integration of structured reasoning with embodied interaction.”

Nanni agreed. “From a philosophy of mind perspective, we can basically say that we have discovered the limits of transformers. Transformers are not the way to build a digital mind,” he said. “They model text so well that it’s almost impossible to tell whether the text was written by a human or a machine. But that’s what language models are… You can only push this architecture so far.”
