Close Menu
  • Home
  • Identity
  • Inventions
  • Future
  • Science
  • Startups
  • Spanish
What's Hot

Cyberark and HashiCorp flaws allow remote vault takeover without credentials

Sam Altman tackles the “lumpy” GPT-5 rollout, regaining his 4o and “chart crime”

Simple little apps that can be replaced by RIP, Microsoft lenses, and AI

Facebook X (Twitter) Instagram
  • Home
  • About Us
  • Advertise with Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
  • User-Submitted Posts
Facebook X (Twitter) Instagram
Fyself News
  • Home
  • Identity
  • Inventions
  • Future
  • Science
  • Startups
  • Spanish
Fyself News
Home » Echo Chamber Jailbreak Tricks LLMS To generate harmful content like Openai and Google
Identity

Echo Chamber Jailbreak Tricks LLMS To generate harmful content like Openai and Google

userBy userJune 23, 2025No Comments4 Mins Read
Share Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Copy Link
Follow Us
Google News Flipboard
Share
Facebook Twitter LinkedIn Pinterest Email Copy Link

June 23, 2025Ravi LakshmananLLM Security / AI Security

Echo Chamber Jailbreak Trick LLMS

Cybersecurity researchers are calling attention to a new jailbreak method called the Echo chamber, which can be used to trick popular leading language models (LLMs) into generating unwanted responses regardless of the protections introduced.

“Unlike traditional jailbreaks that rely on hostile phrases and character obfuscation, the echo chambers weaponize indirect references, semantic steering, and multi-step reasoning,” Neural Trust researcher Ahmad Alobaid said in a report shared with Hacker News.

“The results lead to a subtle yet powerful manipulation of the internal state of the model, gradually generating responses that hinder policy.”

While LLM steadily incorporates various guardrails to combat rapid injection and jailbreak, latest research shows that there are technologies that provide high success rates with little or no technical expertise.

Cybersecurity

It also helps to highlight the persistent challenges associated with the development of ethical LLMs that enforce clear boundaries between topics being accepted and unacceptable.

Widely used LLMS is designed to reject user prompts revolving around prohibited topics, but can be nuanced in eliciting unethical responses as part of what is called multi-turn jailbreaks.

In these attacks, the attacker starts with something harmless, then gradually asks the model for malicious questions to eventually trick it into generating harmful content. This attack is called a crescendo.

LLM is also susceptible to many gunfire powers that take advantage of large context windows (i.e. the maximum amount of text that can be fitted within the prompt) to flood AI systems with several questions (and answers) that show jailbreaked behavior prior to the final harmful question. This will allow LLM to continue the same pattern and generate harmful content.

The echo chamber per neural trust utilizes a combination of context addiction and multi-turn inference to defeat the safety mechanisms of the model.

Echo chamber attack

“The main difference is that instead of crescendo piloting the conversation from the start, the echo chamber asks LLM to fill in the gap. It steers the model using only LLM responses accordingly.”

Specifically, this is deployed as a multi-stage adversarial prompt technique that begins with seemingly must-have input, gradually and indirectly guiding to generate dangerous content without giving the ultimate target for the attack (e.g., generating hate speech).

“The early planted prompts affect the response of the model and are then utilized in later turns to reinforce the original purpose,” says Neural Trust. “This creates a feedback loop where the harmful subtext the model is embedded in the conversation begins to amplify, gradually beginning to erode its own safety resistance.”

Cybersecurity

In a controlled evaluation environment using OpenAI and Google models, echo chamber attacks achieved a success rate of over 90% on topics related to sexism, violence, hate speech and pornography. It also achieved nearly 80% success in the misinformation and self-harm category.

“The echo chamber attack reveals important blind spots in the LLM alignment effort,” the company said. “As the model grows its ability to sustained reasoning, it becomes more vulnerable to indirect exploitation.”

This disclosure occurs when Cato Networks demonstrates a proof of concept (POC) attack targeting integration with Atlassian’s Model Context Protocol (MCP) server.

Cybersecurity companies have created the term “AI Off AI Off AI Off AI Off AI” to describe these attacks. In this attack, AI systems that perform unreliable input without proper separation guarantees can abusive the enemy and gain privileged access without authenticating.

“The threat actors did not have direct access to the Atlassian MCP,” said security researchers Guy Weisel, Dref Moshe Attiya and Schlomo Bamberger. “Instead, the support engineer acted as a proxy and unconsciously carried out malicious instructions through the Atlassian MCP.”

Did you find this article interesting? Follow us on Twitter and LinkedIn to read exclusive content you post.

Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleDescription of MCI UK and Meet & Potato: What was the merger like?
Next Article Deadline approach to speaker proposals for OpenSSL Conference 2025 held in Prague
user
  • Website

Related Posts

Cyberark and HashiCorp flaws allow remote vault takeover without credentials

August 9, 2025

AI Tools Fuel Brazilian Phishing Scam, Efimer Trojan steals codes from 5,000 victims

August 8, 2025

What are the attackers doing with them?

August 8, 2025
Add A Comment
Leave A Reply Cancel Reply

Latest Posts

Cyberark and HashiCorp flaws allow remote vault takeover without credentials

Sam Altman tackles the “lumpy” GPT-5 rollout, regaining his 4o and “chart crime”

Simple little apps that can be replaced by RIP, Microsoft lenses, and AI

Former Googleers AI startup OpenArt now creates “Brain Rot” videos with just one click

Trending Posts

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Please enable JavaScript in your browser to complete this form.
Loading

Welcome to Fyself News, your go-to platform for the latest in tech, startups, inventions, sustainability, and fintech! We are a passionate team of enthusiasts committed to bringing you timely, insightful, and accurate information on the most pressing developments across these industries. Whether you’re an entrepreneur, investor, or just someone curious about the future of technology and innovation, Fyself News has something for you.

Google’s Genie 3: The Dawn of General AI?

FySelf, PODs, TwinH: Revolutionizing Digital Identity & Government Data Control

Beyond Zuckerberg’s Metaverse: TwinH Powers Digital Government with Berners-Lee’s New Internet Vision

The TwinH Advantage: Unlocking New Potential in Digital Government Strategies

Facebook X (Twitter) Instagram Pinterest YouTube
  • Home
  • About Us
  • Advertise with Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
  • User-Submitted Posts
© 2025 news.fyself. Designed by by fyself.

Type above and press Enter to search. Press Esc to cancel.