Scientists have found that when artificial intelligence (AI) is allowed to communicate more like humans do, it reaches more accurate conclusions and becomes a more effective debate partner.
Human communication is full of stops and starts, impassioned interruptions, uncertain silences and ambiguity. AI, on the other hand, follows the formal communication style of computers: processing commands, formulating responses, sending output, and patiently waiting for the next command.
Sei and his colleagues proposed a framework in which large language models (LLMs) do not have to follow the rigid back-and-forth, turn-taking pattern of computerized communication. Instead, an LLM can be assigned a personality that speaks out of turn, interrupts other speakers, or remains silent.
In addition to producing a more human-like style of AI communication, the researchers found that this flexibility increased accuracy on complex tasks compared with standard LLM setups.
Distinct personalities
The research team first gave each LLM agent a personality profile based on classical psychology's "Big Five" traits: openness, conscientiousness, extraversion, agreeableness, and neuroticism.
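One way to picture this step is rendering a Big Five trait profile into an agent's system prompt. The function name, trait values, and prompt wording below are illustrative assumptions, not the team's actual prompts:

```python
# Hypothetical sketch: giving each agent a "Big Five" personality profile
# via its system prompt. Trait values and wording are illustrative only.

BIG_FIVE = ["openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism"]

def personality_prompt(name: str, traits: dict[str, float]) -> str:
    """Render a trait profile (0.0-1.0 per dimension) as a system prompt."""
    lines = [f"You are {name}, a discussion participant."]
    for trait in BIG_FIVE:
        level = traits.get(trait, 0.5)  # unspecified traits default to moderate
        label = "high" if level >= 0.66 else "low" if level <= 0.33 else "moderate"
        lines.append(f"- {trait}: {label} ({level:.2f})")
    lines.append("Let these traits shape how assertively and how often you speak.")
    return "\n".join(lines)

# Example: an extraverted, disagreeable agent likely to interrupt.
prompt = personality_prompt("Agent A", {"extraversion": 0.9, "agreeableness": 0.2})
print(prompt)
```

A prompt like this would then be passed to the LLM as its system message, so the same underlying model can play several differently behaved participants in one discussion.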
The next step was to reprogram the text-based LLM to process conversation sentence by sentence, rather than generating a complete response before the next one could begin. This let the researchers control the flow of the discussion precisely. They also compared results across three conversation settings: fixed speaking order, dynamic speaking order, and dynamic speaking order with interruptions enabled. In the last setting, each model computes an "urgency score" as it follows the conversation in real time.
Urgency scores shaped the conversations in several ways. If a model's score spiked because it spotted a mistake or something important to the discussion, it would interject immediately, regardless of whose turn it was to speak. If the urgency score was low, the model treated this as having nothing concrete to add and stayed silent, which in itself made the conversation less "cluttered."
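The turn-taking logic described above can be sketched as a simple selection rule. The thresholds, the `Agent` structure, and the keyword-based scoring heuristic are all assumptions for illustration; in the actual framework, the LLM itself scores each new sentence:

```python
# Minimal sketch of urgency-score turn-taking, under assumed thresholds.
from dataclasses import dataclass

INTERRUPT_THRESHOLD = 0.8   # assumed cutoff for speaking out of turn
SILENCE_THRESHOLD = 0.2     # at or below this, an agent passes its turn

@dataclass
class Agent:
    name: str

    def urgency(self, last_sentence: str) -> float:
        # Stand-in heuristic: a real agent would ask the LLM to score the
        # sentence. Here, spotting the word "error" spikes urgency.
        return 0.9 if "error" in last_sentence else 0.1

def next_speaker(agents: list[Agent], current: Agent, last_sentence: str):
    """Pick who speaks next: an interrupter, the scheduled speaker, or no one."""
    for agent in agents:
        if agent is not current and agent.urgency(last_sentence) >= INTERRUPT_THRESHOLD:
            return agent  # high urgency: interrupt regardless of turn order
    if current.urgency(last_sentence) <= SILENCE_THRESHOLD:
        return None       # nothing concrete to add: stay silent
    return current

agents = [Agent("A"), Agent("B"), Agent("C")]
# A sentence flagging a mistake lets the first non-current agent jump in.
print(next_speaker(agents, agents[0], "I think there is an error here.").name)
```

With a neutral sentence such as "The answer is 42.", every score stays low and `next_speaker` returns `None`, i.e. the scheduled speaker passes, which is the "less cluttered" behavior the article describes.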
Sei told Live Science that the team used 1,000 questions from the Massive Multitask Language Understanding (MMLU) benchmark to evaluate performance. MMLU is an AI reasoning test that includes questions from a variety of disciplines, including science and humanities.
“When one agent first gave a wrong answer, the overall accuracy was 68.7% for fixed-order discussions, 73.8% for dynamic order, and 79.2% when interruptions were allowed,” Sei said. “In a more difficult setting where two agents initially gave incorrect answers, accuracy was 37.2% with fixed order, 43.7% with dynamic order, and 49.5% with interruptions enabled.”
Having shown that personality-driven models are more accurate than traditional AI chatbots, Sei would like to explore how these new findings can be applied in practice. The research team plans to apply their findings to different domains characterized by creative collaboration to understand how “digital personalities” influence decision-making within groups.
“In the future, AI agents will increasingly interact with each other and with humans in collaborative environments,” Sei said. “Our findings suggest that discussions shaped by personality, including the ability to interrupt when necessary, may produce better results than strictly turn-based and uniformly polite interactions.”
