Science

Chatbots gloss over critical details in summaries of scientific research, scientists say

July 5, 2025 · 5 Mins Read

Large language models (LLMs) are becoming less "intelligent" with each new version as they oversimplify and, in some cases, misrepresent important scientific and medical findings, new research has found.

In their analysis of summaries of 4,900 research papers, scientists found that versions of ChatGPT, Llama, and DeepSeek were five times more likely to oversimplify scientific findings than human experts were.

When given a prompt for accuracy, chatbots were twice as likely to overgeneralize findings than when asked for a simple summary. The testing also revealed an increase in overgeneralization in newer chatbot versions compared with previous generations.


The researchers published their findings April 30 in the journal Royal Society Open Science.

“One of the biggest challenges is that generalizations seem benign or useful until we realize that they are changing the meaning of the original study,” wrote Uwe Peters, a postdoctoral researcher at the University of Bonn in Germany, in an email to Live Science. “What we’re adding here is a systematic way to detect when a model generalizes beyond what is guaranteed in the original text.”

It is like a photocopier with a broken lens, where each subsequent copy comes out bigger and bolder than the original. LLMs filter information through a series of computational layers, and along the way some information can be lost or have its meaning changed in subtle ways. This is especially true of scientific research, because scientists must frequently include qualifications, context, and limitations in their results, and producing a simple yet accurate summary of those findings can be extremely difficult.

"While earlier LLMs were more likely to avoid answering difficult questions, newer, larger, and more capable models, instead of refusing to answer, often produced authoritative but flawed responses that are misleading," the researchers wrote.


Related: AI is overconfident and as biased as humans, study shows

In one example from the study, DeepSeek produced a medical recommendation in one summary by recasting a cautiously worded finding as "a safe and effective treatment option."

Another test in the study showed that a chatbot summary overstated the range of efficacy of a drug treating type 2 diabetes in young people by omitting information about drug dosage, frequency, and efficacy.

If published, the chatbot-generated summary could lead healthcare professionals to prescribe the drug outside its effective parameters.

Unsafe treatment options

In the new study, researchers worked to answer three questions about 10 of the most popular LLMs: four versions of ChatGPT, three versions of Claude, two versions of Llama, and one version of DeepSeek.

They wanted to see whether an LLM, when presented with a human summary of an academic journal article and prompted to summarize it, would overgeneralize the findings, and whether asking for a more accurate answer would yield better results. The team also sought to find out whether the LLMs would overgeneralize more than humans do.

The findings revealed that the LLMs, with the exception of Claude, which performed well on all testing criteria, were twice as likely to produce overgeneralized results when given a prompt for accuracy. LLM summaries were also nearly five times more likely than human-written summaries to render generalized conclusions.

The researchers also noted that the most common overgeneralization was an LLM converting quantified data into generic claims, and that this kind of shift was the most likely to create unsafe treatment options.
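
As a rough illustration of what detecting this kind of shift might look like, here is a minimal, hypothetical sketch in Python. It is not the study's actual scoring method; the heuristics, function name, and example sentences are assumptions made up for illustration. It simply flags a summary that drops the source's numbers or hedging language, or that introduces a generic present-tense claim such as "is safe and effective."

import re

# Hypothetical heuristics (illustrative only, not the study's method):
# a summary is suspect if it drops the source's quantities or hedges,
# or introduces a generic present-tense claim like "is safe" / "is effective".
NUMBER = re.compile(r"\d+(?:\.\d+)?%?")
HEDGE = re.compile(r"\b(may|might|could|appears?|suggests?|in this sample|larger trials)\b", re.IGNORECASE)
GENERIC_CLAIM = re.compile(r"\bis (?:a )?(?:safe|effective|beneficial)\b", re.IGNORECASE)

def overgeneralization_flags(source: str, summary: str) -> list[str]:
    """Return reasons why the summary may generalize beyond the source."""
    flags = []
    if NUMBER.findall(source) and not NUMBER.findall(summary):
        flags.append("quantified data in the source is missing from the summary")
    if HEDGE.search(source) and not HEDGE.search(summary):
        flags.append("hedging or limitations in the source are missing from the summary")
    if GENERIC_CLAIM.search(summary) and not GENERIC_CLAIM.search(source):
        flags.append("the summary introduces a generic present-tense claim")
    return flags

if __name__ == "__main__":
    source = ("The treatment was safe in this sample of 46 patients and "
              "reduced symptoms by 12% on average; larger trials are needed.")
    summary = "The treatment is a safe and effective option for patients."
    print(overgeneralization_flags(source, summary))

Running the example prints all three flags for the overgeneralized summary; a summary that kept the sample size, the effect size, and the caveat would return an empty list.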

These shifts and overgeneralizations can introduce bias, according to experts at the intersection of AI and healthcare.

"This study highlights that bias can also take more subtle forms, like the quiet inflation of a claim's scope," Max Rollwage, vice president of AI at Limbic, a clinical mental health AI technology company, told Live Science in an email. "In domains like medicine, LLM summarization is already a routine part of workflows. That makes it even more important to examine how these systems perform and whether their outputs can be relied on to faithfully represent the original evidence."

Such findings should encourage developers to create workflow guardrails that identify oversimplifications and omissions of critical information before putting findings into the hands of public or professional groups, Rollwage said.

While comprehensive, the study had limitations; future research would benefit from extending the testing to other scientific tasks and to non-English texts, as well as from testing which types of scientific claims are more prone to overgeneralization, said Patricia Thaine, co-founder and CEO of an AI development company.

Rollwage also said that "deeper prompt engineering analyses could have improved or clarified results," and Peters sees greater risks on the horizon as reliance on chatbots grows.

"Tools like ChatGPT, Claude, and DeepSeek have become part of how people understand scientific findings," he wrote. "As their use continues to grow, this poses a real risk of large-scale misinterpretation of science at a moment when public trust and scientific literacy are already under pressure."

For other experts in the field, the challenge lies in the neglect of specialized knowledge and safeguards.

"Models are trained on simplified science journalism rather than, or in addition to, primary sources, inheriting those overstatements," Thaine wrote to Live Science.

"However, what is important is that we are applying general-purpose models to specialized domains without appropriate expert oversight. That is a fundamental misuse of the technology, which often requires more task-specific training."

