Close Menu
  • Home
  • Identity
  • Inventions
  • Future
  • Science
  • Startups
  • Spanish
What's Hot

Firewall Exploits, AI Data Theft, Android Hacks, APT Attacks, Insider Leaks & More

How to browse the web more sustainably with a green browser

Japan joins groundbreaking research partnership with Horizon Europe

Facebook X (Twitter) Instagram
  • Home
  • About Us
  • Advertise with Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
  • User-Submitted Posts
Facebook X (Twitter) Instagram
Fyself News
  • Home
  • Identity
  • Inventions
  • Future
  • Science
  • Startups
  • Spanish
Fyself News
Home » deepseek launches flashmla: nvidia gpus AI speed and efficiency breakthrough
Tech

deepseek launches flashmla: nvidia gpus AI speed and efficiency breakthrough

userBy userFebruary 24, 2025No Comments3 Mins Read
Share Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Copy Link
Follow Us
Google News Flipboard
Share
Facebook Twitter LinkedIn Pinterest Email Copy Link

Following the success of the R1 model, Chinese AI startup DeepSeek unveiled Flashmla on Monday. This is an open source, multi-head latent attention (MLA) decoding kernel optimized for Nvidia’s hopper GPUs. Flashmla is a highly efficient translator and considers it to be both a turbo boost for AI models, responding faster in conversations, improving everything from chatbots to voice assistants and AI-driven search tools It will help you.

This release is part of Deepseek’s Open Source Week, highlighting efforts to improve AI performance and accessibility through community-driven innovation.

In X’s post, Deepseek said,

“It’s an honor to share Flashmla. It’s an efficient MLA decoding kernel for hopper GPUs, optimized for variable length sequences and is currently in production.”

##Opensourceweek Day 1: flashmla

I am honored to share Flashmla – an efficient MLA decoding kernel for Hopper GPUs, optimized for variable length sequences and is currently in production.

bf16 support
paged KV cache page (block size 64)
⚡3000 gb/s memory bound & 580 tflops…

– Deepseek (@deepseek_ai) February 24, 2025

Why Flashmla is a big deal

Flashmla is designed to maximize AI efficiency. It supports BF16 Precision, uses a 64-block-sized page KV cache, and offers the highest tier performance with 3000 GB/s memory bandwidth and 580 TFLOPS on an H800 GPU.

The real magic is how to handle variable length sequences. This significantly reduces computational load while speeding up AI performance. This has attracted the attention of AI developers and researchers.

Flashmla’s main features:

High Performance: FlashMLA leverages CUDA 12.6 to achieve up to 3000 GB/s of memory bandwidth and 580 TFLOPS calculation throughput on an H800 SXM5 GPU.

Optimized for variable length sequences. It is designed to efficiently handle variable-length sequences and enhance the decoding process of AI applications.

BF16 support and page KV caching: BF16 precision and 64 block size page key value cache is included, reducing memory overhead during large model inference.

How to improve AI performance

🚀Fast response
AI models typically process information before generating a reply. Flashmla makes this process much faster and improves response times, especially for long conversations.

Handle conversations extended with lag
Conversation history (kv cache) in AI chatbots. Flashmla optimizes this and tracks the discussion without AI slowing down or overloading the hardware.

Optimized for high end AI systems
Built for Nvidia’s Hopper series GPUs, Flashmla runs at peak efficiency on advanced AI hardware, making it the perfect solution for large-scale applications.

Why is it important?

Flashmla is open source, so AI developers can use it for free and refine and build on its capabilities. This means faster, smarter AI tools when it comes to chatbots, translation software, or AI-generated content.

Real life examples

Imagine this: you are chatting with a customer service bot. Without Flashmla there is a prominent pause before each response. With Flashmla, replies come instantly and make your conversation feel seamless. Most of the time it’s like talking to real people.

Ultimately, Deepseek’s push for open-source AI innovation paves the way for even greater advancement, potentially providing developers with tools to push AI performance to new heights.


Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleMIA: Growth forecast in 2025 due to UK business events despite employment challenges
Next Article Researchers accus North Korea of ​​$1.4 billion of Bibit Crypto robbers
user
  • Website

Related Posts

Castilla-La Mancha Ignites Innovation: fiveclmsummit Redefines Tech Future

December 10, 2025

Local Power, Health Innovation: Alcolea de Calatrava Boosts FiveCLM PoC with Community Engagement

November 26, 2025

The Future of Digital Twins in Healthcare: From Virtual Replicas to Personalized Medical Models

November 25, 2025
Add A Comment
Leave A Reply Cancel Reply

Latest Posts

Firewall Exploits, AI Data Theft, Android Hacks, APT Attacks, Insider Leaks & More

How to browse the web more sustainably with a green browser

Japan joins groundbreaking research partnership with Horizon Europe

UK government announces biggest animal welfare reforms in history

Trending Posts

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Please enable JavaScript in your browser to complete this form.
Loading

Welcome to Fyself News, your go-to platform for the latest in tech, startups, inventions, sustainability, and fintech! We are a passionate team of enthusiasts committed to bringing you timely, insightful, and accurate information on the most pressing developments across these industries. Whether you’re an entrepreneur, investor, or just someone curious about the future of technology and innovation, Fyself News has something for you.

Castilla-La Mancha Ignites Innovation: fiveclmsummit Redefines Tech Future

Local Power, Health Innovation: Alcolea de Calatrava Boosts FiveCLM PoC with Community Engagement

The Future of Digital Twins in Healthcare: From Virtual Replicas to Personalized Medical Models

Human Digital Twins: The Next Tech Frontier Set to Transform Healthcare and Beyond

Facebook X (Twitter) Instagram Pinterest YouTube
  • Home
  • About Us
  • Advertise with Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
  • User-Submitted Posts
© 2025 news.fyself. Designed by by fyself.

Type above and press Enter to search. Press Esc to cancel.