Close Menu
  • Home
  • Identity
  • Inventions
  • Future
  • Science
  • Startups
  • Spanish
What's Hot

The Digital Twin Revolution: Reshaping Industry 4.0

Databricks, co-founder of Prperxity, pledges $100 million to a new fund for AI researchers

1-inch rollout expanded bug bounty features rewards up to $500,000

Facebook X (Twitter) Instagram
  • Home
  • About Us
  • Advertise with Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
  • User-Submitted Posts
Facebook X (Twitter) Instagram
Fyself News
  • Home
  • Identity
  • Inventions
  • Future
  • Science
  • Startups
  • Spanish
Fyself News
Home » Echo Chamber Jailbreak Tricks LLMS To generate harmful content like Openai and Google
Identity

Echo Chamber Jailbreak Tricks LLMS To generate harmful content like Openai and Google

userBy userJune 23, 2025No Comments4 Mins Read
Share Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Copy Link
Follow Us
Google News Flipboard
Share
Facebook Twitter LinkedIn Pinterest Email Copy Link

June 23, 2025Ravi LakshmananLLM Security / AI Security

Echo Chamber Jailbreak Trick LLMS

Cybersecurity researchers are calling attention to a new jailbreak method called the Echo chamber, which can be used to trick popular leading language models (LLMs) into generating unwanted responses regardless of the protections introduced.

“Unlike traditional jailbreaks that rely on hostile phrases and character obfuscation, the echo chambers weaponize indirect references, semantic steering, and multi-step reasoning,” Neural Trust researcher Ahmad Alobaid said in a report shared with Hacker News.

“The results lead to a subtle yet powerful manipulation of the internal state of the model, gradually generating responses that hinder policy.”

While LLM steadily incorporates various guardrails to combat rapid injection and jailbreak, latest research shows that there are technologies that provide high success rates with little or no technical expertise.

Cybersecurity

It also helps to highlight the persistent challenges associated with the development of ethical LLMs that enforce clear boundaries between topics being accepted and unacceptable.

Widely used LLMS is designed to reject user prompts revolving around prohibited topics, but can be nuanced in eliciting unethical responses as part of what is called multi-turn jailbreaks.

In these attacks, the attacker starts with something harmless, then gradually asks the model for malicious questions to eventually trick it into generating harmful content. This attack is called a crescendo.

LLM is also susceptible to many gunfire powers that take advantage of large context windows (i.e. the maximum amount of text that can be fitted within the prompt) to flood AI systems with several questions (and answers) that show jailbreaked behavior prior to the final harmful question. This will allow LLM to continue the same pattern and generate harmful content.

The echo chamber per neural trust utilizes a combination of context addiction and multi-turn inference to defeat the safety mechanisms of the model.

Echo chamber attack

“The main difference is that instead of crescendo piloting the conversation from the start, the echo chamber asks LLM to fill in the gap. It steers the model using only LLM responses accordingly.”

Specifically, this is deployed as a multi-stage adversarial prompt technique that begins with seemingly must-have input, gradually and indirectly guiding to generate dangerous content without giving the ultimate target for the attack (e.g., generating hate speech).

“The early planted prompts affect the response of the model and are then utilized in later turns to reinforce the original purpose,” says Neural Trust. “This creates a feedback loop where the harmful subtext the model is embedded in the conversation begins to amplify, gradually beginning to erode its own safety resistance.”

Cybersecurity

In a controlled evaluation environment using OpenAI and Google models, echo chamber attacks achieved a success rate of over 90% on topics related to sexism, violence, hate speech and pornography. It also achieved nearly 80% success in the misinformation and self-harm category.

“The echo chamber attack reveals important blind spots in the LLM alignment effort,” the company said. “As the model grows its ability to sustained reasoning, it becomes more vulnerable to indirect exploitation.”

This disclosure occurs when Cato Networks demonstrates a proof of concept (POC) attack targeting integration with Atlassian’s Model Context Protocol (MCP) server.

Cybersecurity companies have created the term “AI Off AI Off AI Off AI Off AI” to describe these attacks. In this attack, AI systems that perform unreliable input without proper separation guarantees can abusive the enemy and gain privileged access without authenticating.

“The threat actors did not have direct access to the Atlassian MCP,” said security researchers Guy Weisel, Dref Moshe Attiya and Schlomo Bamberger. “Instead, the support engineer acted as a proxy and unconsciously carried out malicious instructions through the Atlassian MCP.”

Did you find this article interesting? Follow us on Twitter and LinkedIn to read exclusive content you post.

Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleDescription of MCI UK and Meet & Potato: What was the merger like?
Next Article Deadline approach to speaker proposals for OpenSSL Conference 2025 held in Prague
user
  • Website

Related Posts

The Digital Twin Revolution: Reshaping Industry 4.0

June 23, 2025

DHS warns Proilan hackers who are likely to target US networks after Iran’s nuclear attack

June 23, 2025

XDIGO Malware exploits Windows LNK flaws in Eastern European government attacks

June 23, 2025
Add A Comment
Leave A Reply Cancel Reply

Latest Posts

The Digital Twin Revolution: Reshaping Industry 4.0

Databricks, co-founder of Prperxity, pledges $100 million to a new fund for AI researchers

1-inch rollout expanded bug bounty features rewards up to $500,000

Why Wall Street is actually high after the US bombing Iran

Trending Posts

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Please enable JavaScript in your browser to complete this form.
Loading

Welcome to Fyself News, your go-to platform for the latest in tech, startups, inventions, sustainability, and fintech! We are a passionate team of enthusiasts committed to bringing you timely, insightful, and accurate information on the most pressing developments across these industries. Whether you’re an entrepreneur, investor, or just someone curious about the future of technology and innovation, Fyself News has something for you.

The Digital Twin Revolution: Reshaping Industry 4.0

1-inch rollout expanded bug bounty features rewards up to $500,000

PhysicsX raises $135 million to bring AI-first engineering to aerospace, automobiles and energy

Deadline approach to speaker proposals for OpenSSL Conference 2025 held in Prague

Facebook X (Twitter) Instagram Pinterest YouTube
  • Home
  • About Us
  • Advertise with Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
  • User-Submitted Posts
© 2025 news.fyself. Designed by by fyself.

Type above and press Enter to search. Press Esc to cancel.