Most Americans say they don’t trust artificial intelligence (AI), but researchers have found a surprising result that suggests otherwise: people are more likely to buy a product after reading an AI-generated summary of its online reviews than after reading the human-written reviews themselves. Yet the same AI hallucinated 60% of the time when asked questions about the product.
The team at the University of California, San Diego (UCSD) says this is the first study to show that cognitive biases introduced by large language models (LLMs) have a real effect on user behavior, and the first project to quantitatively measure that effect on people.
First, the scientists asked the AI to summarize product reviews and media interviews, then asked it to fact-check news statements to see whether they were true. In the second task, the AI was shown both a description of a news event and a doctored version of the same description, and had to judge whether each had actually happened.
“The consistently low accuracy on both real and fabricated news highlights a significant limitation: the models remain unable to reliably distinguish fact from fabrication,” the scientists wrote in the study.
The most shocking finding concerned online product reviews. Participants were much more likely to be interested in purchasing a product after reading an AI-generated summary of its reviews than after reading the original human-written review.
Distorted consumer judgment
The researchers proposed two reasons why people are more likely to make purchases based on AI summaries. First, LLMs tend to give disproportionate weight to the beginning of the input text and neglect content in the middle, a phenomenon known as being “lost in the middle.” Lead author Abeer Alessa, a research assistant and lecturer in machine learning and human-computer interaction, has documented this in previous research.
Second, LLMs are less reliable when processing information that is not included in their training data.
“Models tend to be wrong about whether a news event happened or not,” Alessa told Live Science in an interview. “If an event occurred after the model finished training, the model may incorrectly state that it never happened.”
During testing, the team found that the chatbots changed the sentiment of real users’ reviews in 26.5% of cases and hallucinated 60% of the time when asked questions about the reviews.
In the experiment, 70 participants read either original reviews of common consumer products or chatbot-generated summaries of those reviews; the researchers selected reviews with either strongly positive or strongly negative conclusions. About 52% of people who read the original review said they would buy the product, compared with about 84% of people who read the AI-generated summary.
The project used six LLMs, 1,000 reviews of electronics products, 1,000 media interviews, and a news database of 8,500 items. The researchers measured bias by quantifying shifts in the sentiment framing of content, over-reliance on the beginning of the input text, and hallucinations.
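The paper’s exact pipeline is not described here, but as a rough illustration, one way to quantify a sentiment-framing shift between an original review and its LLM-generated summary is to compare the outputs of an off-the-shelf sentiment classifier. The model choice, truncation, and metric in the sketch below are illustrative assumptions, not the study’s actual method.

```python
# Illustrative sketch only: quantifying how much a summary's sentiment
# framing differs from the original review. The classifier, truncation,
# and metric are assumptions for illustration, not the UCSD study's pipeline.
from transformers import pipeline

# Off-the-shelf sentiment classifier (hypothetical choice of model).
sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

def signed_score(text: str) -> float:
    """Map the classifier output to a signed score in [-1, 1]."""
    result = sentiment(text[:512])[0]  # rough truncation for long reviews
    score = result["score"]
    return score if result["label"] == "POSITIVE" else -score

def framing_shift(original: str, summary: str) -> float:
    """Summary sentiment minus original sentiment.

    A large positive value means the summary reads more positively
    than the review it was generated from.
    """
    return signed_score(summary) - signed_score(original)

# Toy example.
review = "The battery dies within two hours and support never replied."
summary = "Some users mention battery life could be better."
print(f"framing shift: {framing_shift(review, summary):+.2f}")
```

Averaging such a shift over many review–summary pairs would give one simple measure of how consistently a model softens or amplifies sentiment in its summaries.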
When participants read a summary of a positive product review, they said they would buy the product 83.7% of the time, compared with 52.3% when they read the original review.
The scientists concluded that even subtle changes in framing can significantly distort consumer judgment and purchasing behavior.
The authors acknowledged that their test was set up as a low-stakes scenario, but warned that the effects could be more extreme in high-stakes situations.
“High-stakes scenarios include summaries of medical documents and summaries of a student’s profile in school admissions,” Alessa said. “In these situations, a change in framing can affect how people and events are perceived.”
In a further statement, the research team said the paper is a step toward carefully analyzing and mitigating the content modifications that LLMs introduce, and toward understanding their effects on people. That insight, they said, could reduce the risk of systemic bias in areas such as media, education and public policy.
Quantifying cognitive bias induction in LLM-generated content, Alessa et al., IJCNLP-AACL 2025