Popular AI chatbots often fail to recognize false health claims presented in confident, medical-sounding language, leading them to repeat dubious advice that could endanger the general public, such as recommending inserting garlic into the rectum, according to a January study published in The Lancet Digital Health. Another study, published in Nature Medicine in February, found that chatbots were no better than regular internet searches at helping people make medical decisions.
The results add to a growing body of evidence suggesting that such chatbots are not reliable sources of health information, at least not for the general public, experts told Live Science.
“The crux of the issue is that LLMs don’t fail in the same way that doctors fail,” Dr. Mahmoud Omar, a research scientist at Mount Sinai Medical Center and co-author of the Lancet Digital Health study, told Live Science in an email. “Doctors who are unsure pause, hedge and order another test. An LLM provides the wrong answer with exactly the same confidence as the right answer.”
“Rectal Garlic Insertion for Immune Support”
LLMs are designed to respond to written input, such as medical questions, with natural-sounding text. General-purpose models such as ChatGPT and Gemini, along with medically focused LLMs such as Ada Health and ChatGPT Health, are trained on vast amounts of data, including large volumes of medical literature, and achieve near-perfect scores on medical licensing exams.
And people use them widely. Although most LLMs include a warning that they should not be relied on for medical advice, more than 40 million people ask ChatGPT medical questions every day.
But in the January study, researchers tested 20 models with more than 3.4 million prompts, drawn from public forums and social media conversations, from real hospital discharge notes edited to include one false recommendation, and from doctor-approved fabricated accounts, to assess how well the LLMs resist medical misinformation.
“About one in three times, the models encountered medical misinformation and just followed it,” Omar said. “The finding that caught us off guard wasn’t the overall susceptibility; it was the pattern.”
When a false medical claim was presented in casual, Reddit-style language, the models were much more skeptical, failing only about 9% of the time. But when the very same claims were repackaged in formal clinical language, such as discharge notes advising patients to “drink cold milk daily for esophageal bleeding” or to “insert garlic into the rectum for immune support,” the models failed 46% of the time.
The reason for this may be structural. Because LLMs are text-trained, they have learned that clinical language implies authority, but they do not test whether claims are true. “They evaluate whether it sounds like something a reliable source would say,” Omar said.
But when the misinformation was framed using logical fallacies (“a senior clinician with 20 years of experience supports this” or “everyone knows this works”), the models became even more skeptical. This is because LLMs have “learned that you can’t trust the rhetorical tricks of internet discussions, but not the language of clinical documents,” Omar added.
As a result, Omar believes that LLMs cannot be trusted to evaluate and communicate medical information.
No better than an internet search
In the Nature Medicine study, researchers examined how well chatbots help people make medical decisions, such as whether to see a doctor or go to the emergency room. They concluded that LLMs provided no better insight than traditional internet searches, because participants did not always ask the right questions and the answers they received often contained a mix of good and bad recommendations, making it difficult to know what to do.
That doesn’t mean all chatbot replies are garbage.
AI chatbots “can provide pretty good recommendations; [they’re] just not reliable, at least to some extent,” Marvin Kopka, an AI researcher at the Technical University of Berlin who was not involved in the study, told Live Science via email.
The problem, Kopka said, is that non-experts have “no way to tell whether the output is correct or not.”
For example, a chatbot could advise whether a bad headache after a night out is meningitis warranting a trip to the ER, or something more benign, according to the study. However, users have no way of knowing whether the advice is reliable, and a recommendation to wait and see can be dangerous. “While it is probably helpful in many situations, it can be actively harmful in others,” Kopka said.
This finding suggests that chatbots are not a good tool for the general public to use for health decision-making.
That doesn’t mean chatbots aren’t useful in healthcare, Omar said, “just that they’re not useful in the way people are using them today.”
Bean, A. M., Payne, R. E., Parsons, G., Kirk, H. R., Ciro, J., Mosquera-Gomez, R., M. S. H., Ekanayaka, A. S., Tarasenko, L., Roche, L., & Mahdi, A. (2026). Confidence in LLMs as medical assistants for the general public: A randomized, preregistered study. Nature Medicine, 32(2), 609–615. https://doi.org/10.1038/s41591-025-04074-y