Fyself News
Startups

DeepSeek’s distilled new R1 AI model can run on a single GPU

By user | May 29, 2025 | 2 Mins Read

DeepSeek’s updated R1 reasoning AI model may be getting most of the AI community’s attention this week. But the Chinese AI lab has also released a smaller, “distilled” version of the new R1, DeepSeek-R1-0528-Qwen3-8B, which DeepSeek claims beats comparably sized models on certain benchmarks.

The smaller updated R1, built using the Qwen3-8B model that Alibaba launched in May as a foundation, performs better than Google’s Gemini 2.5 Flash on AIME 2025, a math benchmark.

DeepSeek-R1-0528-Qwen3-8B also roughly matches Microsoft’s recently released Phi 4 reasoning plus model on another math skills test, HMMT.

So-called distilled models, such as DeepSeek-R1-0528-Qwen3-8B, are generally less capable than their full-size counterparts. On the plus side, they are far less computationally demanding. According to the cloud platform NodeShift, Qwen3-8B requires a GPU with 40GB-80GB of RAM to run (for example, an NVIDIA H100). The full-size new R1 needs around a dozen 80GB GPUs.
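To put the memory figures above in context, here is a back-of-the-envelope sketch of how much memory the weights alone of an 8-billion-parameter model occupy at common numeric precisions. It assumes "8B" means a nominal 8 billion parameters and counts only the weights; the KV cache, activations, and framework overhead are ignored, so real requirements are higher. The function name and the list of precisions are illustrative, not taken from DeepSeek or NodeShift.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: "8B" is treated as a nominal 8 billion parameters.
NOMINAL_PARAMS = 8e9

# Bytes per parameter at common precisions.
PRECISIONS = [("FP32", 4.0), ("FP16/BF16", 2.0), ("INT8", 1.0), ("INT4", 0.5)]

for label, nbytes in PRECISIONS:
    gb = weight_memory_gb(NOMINAL_PARAMS, nbytes)
    print(f"{label:>9}: about {gb:.0f} GB for weights only")
```

Under these assumptions the weights fit within a single 80GB GPU at every listed precision, which is consistent with the single-GPU claim; the estimate does not attempt to reproduce NodeShift's 40GB-80GB figure.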

DeepSeek trained DeepSeek-R1-0528-Qwen3-8B by taking text generated by the updated R1 and using it to fine-tune Qwen3-8B. On a dedicated web page for the model on the AI dev platform Hugging Face, DeepSeek describes DeepSeek-R1-0528-Qwen3-8B as “for both academic research on reasoning models and industrial development focused on small-scale models.”
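The data-preparation half of that recipe can be sketched in a few lines. This is a minimal illustration of the general idea (collect a teacher model's outputs as fine-tuning targets for a smaller student), not DeepSeek's actual pipeline. `teacher_generate` is a hypothetical callable standing in for the large model, and the `prompt`/`completion` record layout is an assumed convention; the fine-tuning step itself is out of scope here.

```python
import json
from typing import Callable, Dict, Iterable, List


def build_distillation_records(
    prompts: Iterable[str],
    teacher_generate: Callable[[str], str],  # hypothetical stand-in for the teacher model
) -> List[Dict[str, str]]:
    """Pair each prompt with the teacher's generated text, skipping empty outputs."""
    records = []
    for prompt in prompts:
        completion = teacher_generate(prompt)
        if not completion.strip():
            continue  # nothing for the student to learn from
        records.append({"prompt": prompt, "completion": completion})
    return records


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records one JSON object per line, a common fine-tuning data layout."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def toy_teacher(prompt: str) -> str:
    """Placeholder teacher: a real setup would call the large model here."""
    return f"Reasoning about: {prompt}"


if __name__ == "__main__":
    data = build_distillation_records(["What is 2 + 2?", "Factor 91."], toy_teacher)
    print(to_jsonl(data))
```

The resulting file would then serve as supervised fine-tuning data for the smaller model, so the student learns to imitate the teacher's generated text.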

DeepSeek-R1-0528-Qwen3-8B is available under a permissive MIT license, meaning it can be used commercially without restriction. Several hosts, including LM Studio, already offer the model through an API.


© 2025 news.fyself. Designed by fyself.
