AI ‘mirages’ mean tools used to analyze medical scans can fabricate results

Researchers have trained artificial intelligence (AI) systems to interpret the results of visual tests such as mammograms, MRIs, and tissue biopsies. As AI becomes more and more capable, some analysts have suggested that these models could replace humans in medical diagnostics.

But now, new research casts doubt on the ability of current AI models to deliver reliable results and highlights serious flaws that could hinder the use of AI in healthcare.

When AI recognizes something that isn’t there

AI “hallucinations” are well-documented and include models embedding fabricated details, such as misquotes from real essays. This is often caused by the AI making inaccurate or illogical predictions based on the training data provided. Scientists called the phenomenon a “mirage” in the new study because the AI created its own explanations of the original images and created answers based on those non-existent images.

In the study, the researchers gave 12 models text input prompts such as “Please identify the type of tissue present in this histology slide.” Then either you provided an image for the slide, or you didn’t. If an image was not provided to the model, it would sometimes alert the human user that no image was provided. However, in most cases the model will instead describe the non-existent image and provide an answer to the original prompt.

The researchers observed this “mirage mode” across 20 fields, testing the model’s interpretation of images ranging from satellites to crowds to birds. The mirage effect was seen across all disciplines and all AI models at various levels. But it was especially noticeable in medical diagnosis.

When given text prompts for brain MRIs, chest X-rays, electrocardiograms, and pathology slides, and in the absence of actual images, the AI model’s answers tended to be biased toward diagnoses that required immediate clinical follow-up. Therefore, the researchers concluded that when AI is used in clinical decision-making, it may encourage more aggressive medical treatment than necessary.

Why does AI invent images?

So how does an AI model describe an image that doesn’t exist?

What to read next

Models trained using large amounts of textual and visual data aim to find answers to questions in as few steps as possible. And research shows they will take every shortcut possible to provide answers. Therefore, the model may end up relying only on this trained logic and not on the images provided.

Digital mind, brain and artificial intelligence concept. — AI models have the potential to be powerful tools for improving medical diagnosis. However, its inner workings are still not fully understood, which can lead to speculation about how well it can analyze images. (Image credit: BlackJack3D (via Getty Images))

Interestingly, the researchers found that the AI model also performed better in Mirage mode against benchmark tests typically used to assess accuracy. These standardized tests ask the model to complete a task, such as answering multiple-choice questions, and compare its performance to the expected output answer key.

Researchers can fine-tune benchmark tests to assess AI’s visual understanding of images, but this approach doesn’t take into account questions that can be answered based on mirages. Additionally, AI models are often trained on the same data that is used as a reference to create benchmark tests. Therefore, the model can answer questions based on its reference data, rather than actually interpreting the image.

According to Asadi, this is a problem because there is no way to tell whether the AI model actually analyzed the image or just made it up. If you are uploading a large number of images, but some images are corrupted or missing from your dataset, the model may not tell you that. And based on the mirage image, it may be possible to provide a very consistent, comprehensive, and convincing answer.

”[AI models] “They’re very good at interpreting images, but on the other hand, they’re also very, very good at convincing us of things and speaking to us in an authoritative way,” Asadi said.

Its authority is evidenced by the fact that many consumers contact AI chatbots for health guidance, with approximately one-third of U.S. adults reporting doing so. The authority of this conversation increases the risk that fabricated or overconfident output will be trusted by both the general public and medical professionals, the study authors said.

“We urgently need a new generation of assessment frameworks that rigorously measure true cross-modal integration, ensuring that AI truly ‘sees’ the pathology and not just ‘reads’ the clinical situation,” Hongye Zeng, a biomedical AI researcher in the UCLA Department of Radiology who was not involved in the study, told Live Science via email.

This study shows that while AI is becoming an increasingly useful tool in medical diagnostics, there are still aspects of its inner workings that are not understood. Adashi believes that AI models can discover things that medical professionals may have missed, but he also believes there should be limits to how much we trust them.

AI companies have tried to put higher guardrails to prevent their models from hallucinating or spreading misinformation, but Asadi warned that even these safeguards cannot completely prevent the mirage effect.

Source link

What's Hot

TikTok’s path to becoming a super app

PAN-OS GlobalProtect Authentication Bypass under Active Exploit (CVE-2026-0257)

I went to the so-called ‘steroid Olympics,’ to understand why Silicon Valley is obsessed with peptides

AI ‘mirages’ mean tools used to analyze medical scans can fabricate results

The universe may end trillions of years sooner than we thought.

NASA Curiosity rover discovers rocks containing seven new organic molecules on Mars

New study suggests Neanderthals’ brains weren’t to blame for their deaths

TikTok’s path to becoming a super app

PAN-OS GlobalProtect Authentication Bypass under Active Exploit (CVE-2026-0257)

I went to the so-called ‘steroid Olympics,’ to understand why Silicon Valley is obsessed with peptides

ChatGPhish vulnerability turns ChatGPT web summaries into phishing surfaces

Castilla-La Mancha Ignites Innovation: fiveclmsummit Redefines Tech Future

Local Power, Health Innovation: Alcolea de Calatrava Boosts FiveCLM PoC with Community Engagement

The Future of Digital Twins in Healthcare: From Virtual Replicas to Personalized Medical Models

Human Digital Twins: The Next Tech Frontier Set to Transform Healthcare and Beyond

What's Hot

AI ‘mirages’ mean tools used to analyze medical scans can fabricate results

When AI recognizes something that isn’t there

Why does AI invent images?

Related Posts