Fyself News

DeepSeek launches FlashMLA: an AI speed and efficiency breakthrough for Nvidia GPUs

By user | February 24, 2025 | 3 min read

Following the success of its R1 model, Chinese AI startup DeepSeek unveiled FlashMLA on Monday. It is an open-source multi-head latent attention (MLA) decoding kernel optimized for Nvidia's Hopper GPUs. Think of FlashMLA as both a highly efficient translator and a turbo boost for AI models: it helps them respond faster in conversations, which can improve everything from chatbots to voice assistants and AI-driven search tools.

The release is part of DeepSeek's Open Source Week, which highlights the company's efforts to improve AI performance and accessibility through community-driven innovation.

In a post on X, DeepSeek announced the release:

#OpenSourceWeek Day 1: FlashMLA

Honored to share FlashMLA, an efficient MLA decoding kernel for Hopper GPUs, optimized for variable-length sequences and now in production.

BF16 support
Paged KV cache (block size 64)
⚡ 3000 GB/s memory-bound & 580 TFLOPS…

– DeepSeek (@deepseek_ai) February 24, 2025

Why FlashMLA is a big deal

FlashMLA is designed to maximize AI efficiency. It supports BF16 precision, uses a paged KV cache with a block size of 64, and delivers top-tier performance on an H800 GPU: up to 3000 GB/s when a workload is memory-bound and up to 580 TFLOPS when it is compute-bound.
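To put the two headline figures in context, here is a minimal roofline-style sketch. It uses only the 3000 GB/s and 580 TFLOPS numbers quoted above; the function names and the sample arithmetic intensities are illustrative assumptions, not part of FlashMLA.

```python
# Roofline-style arithmetic using the two figures quoted in the article.
# Function names and the sample intensities below are illustrative only.
PEAK_BANDWIDTH_BYTES_PER_S = 3000e9  # 3000 GB/s (memory-bound figure)
PEAK_COMPUTE_FLOPS_PER_S = 580e12    # 580 TFLOPS (compute-bound figure)


def ridge_point(compute_flops_per_s: float, bandwidth_bytes_per_s: float) -> float:
    """Arithmetic intensity (FLOPs per byte) at which the two limits meet."""
    return compute_flops_per_s / bandwidth_bytes_per_s


def attainable_flops(intensity_flops_per_byte: float) -> float:
    """Throughput is capped by whichever limit is reached first."""
    memory_limit = intensity_flops_per_byte * PEAK_BANDWIDTH_BYTES_PER_S
    return min(PEAK_COMPUTE_FLOPS_PER_S, memory_limit)


if __name__ == "__main__":
    ridge = ridge_point(PEAK_COMPUTE_FLOPS_PER_S, PEAK_BANDWIDTH_BYTES_PER_S)
    print(f"ridge point: {ridge:.1f} FLOPs per byte")
    for intensity in (1, 50, 500):  # hypothetical workloads
        tflops = attainable_flops(intensity) / 1e12
        print(f"intensity {intensity:>3} FLOPs/byte -> at most {tflops:.0f} TFLOPS")
```

Under this simple model, a kernel that performs fewer than roughly 193 floating-point operations per byte it reads is limited by memory bandwidth rather than by compute, which is why both numbers are reported.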

The real magic is in how it handles variable-length sequences: by working only on the tokens each sequence actually contains, it significantly reduces the computational load while speeding up AI performance. This has attracted the attention of AI developers and researchers.
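A rough way to see why variable-length handling matters is to compare a batch padded to its longest sequence with the tokens it really contains. The sketch below is an illustration under that simple padding assumption; the sequence lengths are made up and the code does not describe FlashMLA's internal scheduling.

```python
# Compare a padded batch with the real token count (illustrative numbers).
def padded_slots(seq_lens: list) -> int:
    """Slots processed if every sequence is padded to the longest one."""
    return len(seq_lens) * max(seq_lens) if seq_lens else 0


def actual_tokens(seq_lens: list) -> int:
    """Tokens that carry real content."""
    return sum(seq_lens)


def wasted_fraction(seq_lens: list) -> float:
    """Share of padded slots that hold no real token."""
    padded = padded_slots(seq_lens)
    if padded == 0:
        return 0.0
    return (padded - actual_tokens(seq_lens)) / padded


if __name__ == "__main__":
    lengths = [37, 100, 1000]  # hypothetical conversation lengths in tokens
    print(padded_slots(lengths), actual_tokens(lengths))
    print(f"{wasted_fraction(lengths):.1%} of padded slots are wasted")
```

With these made-up lengths, padding would process 3000 slots for only 1137 real tokens, so about 62% of the work would be spent on padding.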

FlashMLA's main features:

High performance: Running on CUDA 12.6, FlashMLA achieves up to 3000 GB/s of memory bandwidth and 580 TFLOPS of compute throughput on an H800 SXM5 GPU.

Optimized for variable-length sequences: It is designed to handle variable-length sequences efficiently, speeding up the decoding process in AI applications.

BF16 support and paged KV cache: It includes BF16 precision and a paged key-value cache with a block size of 64, reducing memory overhead during large-model inference.
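The paged KV cache idea can be sketched with a toy allocator: memory is split into fixed-size blocks of 64 token slots, and each sequence keeps a table of the blocks it owns. This is a simplified illustration of paging in general; the class and method names are hypothetical and are not FlashMLA's API.

```python
# Toy paged KV cache allocator (hypothetical names, not FlashMLA's API).
BLOCK_SIZE = 64  # token slots per block, matching the block size quoted above


def blocks_needed(num_tokens: int) -> int:
    """Number of fixed-size blocks required to hold num_tokens."""
    return (num_tokens + BLOCK_SIZE - 1) // BLOCK_SIZE


class PagedKVCache:
    """Maps each sequence to a list of physical block ids."""

    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}  # seq_id -> list of physical block ids
        self.seq_lens = {}      # seq_id -> tokens stored so far

    def append_token(self, seq_id: int) -> tuple:
        """Reserve a slot for one new token; return (physical_block, offset)."""
        length = self.seq_lens.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if length % BLOCK_SIZE == 0:  # no block yet, or the last one is full
            if not self.free_blocks:
                raise MemoryError("no free KV cache blocks")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1
        return table[length // BLOCK_SIZE], length % BLOCK_SIZE

    def free(self, seq_id: int) -> None:
        """Return all blocks of a finished sequence to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)


if __name__ == "__main__":
    cache = PagedKVCache(num_blocks=4)
    for _ in range(65):  # 65 tokens spill into a second block
        slot = cache.append_token(seq_id=0)
    print(slot, cache.block_tables[0])
```

Because blocks are handed out on demand, a 65-token sequence occupies two blocks (128 slots) instead of a buffer sized for the longest possible conversation, and its blocks go back to the pool as soon as it finishes.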

How it improves AI performance

🚀 Faster responses
AI models typically process the whole conversation before generating a reply. FlashMLA makes this step much faster and improves response times, especially in long conversations.

Handles extended conversations without lag
AI chatbots keep the conversation history in a KV cache. FlashMLA optimizes how that cache is stored and read, so the AI can keep track of the discussion without slowing down or overloading the hardware.
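To get a feel for how the cache grows with conversation length, here is a small estimate of KV cache size in BF16 and of the time needed to read it once at the 3000 GB/s figure quoted in the article. The layer count and per-token width are hypothetical placeholders, not the dimensions of any DeepSeek model.

```python
# KV cache size estimate in BF16 (model dimensions below are hypothetical).
BF16_BYTES = 2                       # bfloat16 stores each value in 16 bits
PEAK_BANDWIDTH_BYTES_PER_S = 3000e9  # 3000 GB/s, as quoted in the article


def kv_cache_bytes(num_tokens: int, num_layers: int, values_per_token: int) -> int:
    """Bytes needed to cache values_per_token numbers per layer per token."""
    return num_tokens * num_layers * values_per_token * BF16_BYTES


def min_read_time_s(num_bytes: int) -> float:
    """Lower bound on the time to stream the cache once at peak bandwidth."""
    return num_bytes / PEAK_BANDWIDTH_BYTES_PER_S


if __name__ == "__main__":
    for tokens in (512, 4096, 32768):  # hypothetical conversation lengths
        size = kv_cache_bytes(tokens, num_layers=32, values_per_token=1024)
        print(f"{tokens:>6} tokens: {size / 2**20:7.0f} MiB, "
              f"read once in at least {min_read_time_s(size) * 1e3:.3f} ms")
```

The cache grows linearly with the number of tokens, and each newly generated token has to read it again, which is why decoding long conversations tends to be limited by memory bandwidth.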

Optimized for high-end AI systems
Built for Nvidia's Hopper-series GPUs, FlashMLA runs at peak efficiency on advanced AI hardware, making it well suited to large-scale applications.

Why is it important?

FlashMLA is open source, so AI developers can use it for free, refine it, and build on its capabilities. That could mean faster, smarter AI tools, whether chatbots, translation software, or AI-generated content.

A real-life example

Imagine you are chatting with a customer service bot. Without FlashMLA, there is a noticeable pause before each response. With FlashMLA, replies come almost instantly and the conversation feels seamless, much like talking to a real person.

Ultimately, DeepSeek's push for open-source AI innovation paves the way for further advances, potentially giving developers the tools to push AI performance to new heights.
