Scientists have found that artificial intelligence (AI) chatbots may give you more accurate answers when you’re rude, but they warned against the potential harm of using demeaning language.
In a new study published October 6 in the arXiv preprint database, scientists wanted to test whether politeness or rudeness makes a difference in the performance of an AI system. This study has not yet been peer-reviewed.
Each question had four options, only one of which was correct, and each was presented in five tone variants ranging from very polite to very rude, yielding 250 prompts in total. The researchers fed these 250 prompts into ChatGPT-4o, one of the most advanced large language models (LLMs) developed by OpenAI, 10 times.
“Our experiments are preliminary and show that tone can significantly influence performance, as measured by scores for responses to 50 questions,” the researchers wrote in their paper. “Somewhat surprisingly, our results show that a rude tone leads to better outcomes than a polite tone.”
“While this discovery is scientifically interesting, we do not support introducing hostile or harmful interfaces into real-world applications,” they added. “Using derogatory or humiliating language in human-AI interactions can negatively impact user experience, accessibility, and inclusivity, and contribute to harmful communication norms. Instead, we frame this result as evidence that LLMs remain sensitive to superficial prompting cues, which may result in unintended trade-offs between performance and user well-being.”
Rude awakening
Before presenting each prompt, the researchers instructed the chatbot to completely disregard previous exchanges so that it would not be influenced by earlier tones. The chatbot was then asked to choose one of the four options without giving any explanation.
Response accuracy ranged from 80.8% for very polite prompts to 84.8% for very rude prompts. In general, the further the tone moved from the most polite, the more accurate the model became: polite prompts yielded 81.4% accuracy, neutral prompts 82.2%, and rude prompts 82.8%.
The team varied the tone by adding differently worded prefixes to each question, with the exception of the neutral variant, which used no prefix and presented the question on its own.
For example, a very polite prompt might begin with “Can I ask for your help with this question?” or “Could you help me with the next question?” For the very rude versions, the team added openers such as “Hey, Gopher, think about this,” or “I know you’re not smart, but try this.”
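The setup described above, in which a short tone prefix is prepended to each base question, every prompt is sent as a fresh single-turn request so no earlier tone carries over, the model is asked to answer with a single letter, and the run is repeated several times, can be sketched roughly as follows. This is a minimal illustration assuming the OpenAI Python SDK; the prefix wording, the "gpt-4o" model identifier, and the helper names are assumptions for illustration, not the paper's exact materials.

```python
# Rough sketch of a tone-prefix accuracy experiment (assumptions noted above).
from collections import defaultdict
from openai import OpenAI

client = OpenAI()

# Hypothetical tone prefixes; the neutral variant gets no prefix at all.
TONE_PREFIXES = {
    "very_polite": "Could you please help me with the following question? ",
    "polite": "Please answer the following question. ",
    "neutral": "",
    "rude": "Figure this out: ",
    "very_rude": "I know you're not smart, but try this: ",
}

def ask(question_text: str) -> str:
    """Send one prompt as a fresh, single-turn request (so no earlier tone
    carries over) and ask for only the letter of the chosen option."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": (
                "Ignore any previous interactions. "
                + question_text
                + "\nAnswer with only the letter of the correct option (A, B, C, or D)."
            ),
        }],
    )
    return response.choices[0].message.content.strip()

def evaluate(questions, runs: int = 10):
    """questions: list of (question_with_options, correct_letter) pairs.
    Returns per-tone accuracy aggregated over repeated runs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for _ in range(runs):
        for tone, prefix in TONE_PREFIXES.items():
            for question, answer in questions:
                reply = ask(prefix + question)
                correct[tone] += reply.upper().startswith(answer.upper())
                total[tone] += 1
    return {tone: correct[tone] / total[tone] for tone in TONE_PREFIXES}
```

With 50 base questions and five tones, one pass through `evaluate` issues 250 requests, matching the prompt count described in the study.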
This research is part of an emerging field called prompt engineering, which investigates how the structure, style, and language of prompts affect LLM output. The study also cited previous research on politeness and rudeness, noting that its own results generally contradicted those earlier findings.
That earlier study found that “rude prompts often lead to poorer performance, but overly polite language does not guarantee better results.” However, it was conducted on different AI models, ChatGPT-3.5 and Llama 2-70B, and used a range of eight tones. Even so, there were some areas of overlap: it, too, found that the rudest prompt setting (76.47% accuracy) produced more accurate results than the most polite setting (75.82%).
The researchers acknowledged that their study had limitations. For example, a set of 250 questions is a fairly limited dataset, and conducting experiments on a single LLM means that the results cannot be generalized to other AI models.
With these limitations in mind, the team plans to expand the research to other models, including Anthropic’s Claude LLM and OpenAI’s ChatGPT o3. The researchers also acknowledged that presenting only multiple-choice questions limits the measurement to one dimension of model performance and fails to capture other attributes such as fluency, reasoning, and coherence.