DeepSeek has gone viral.
Chinese AI lab DeepSeek broke into the mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts (and Google Play, as well). DeepSeek’s AI models, which were trained using compute-efficient techniques, have led Wall Street analysts and technologists to question whether the U.S. can maintain its lead in the AI race and whether demand for AI chips will hold up.
But where did DeepSeek come from, and how did it rise to international fame so quickly?
DeepSeek’s trader origins
DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.
AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Wenfeng, who began dabbling in trading while a student at Zhejiang University, launched High-Flyer Capital Management in 2019, focused on developing and deploying AI algorithms.
In 2023, High-Flyer launched DeepSeek as a lab dedicated to researching AI tools separate from its financial business. With High-Flyer as one of its investors, the lab spun off into its own company, also called DeepSeek.
From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia’s H800 chip, a less powerful version of the H100, a chip available to U.S. companies.
DeepSeek’s technical team is said to skew young. The company reportedly recruits doctoral AI researchers aggressively from top Chinese universities. DeepSeek also hires people without a computer science background to help its technology better understand a wide range of subjects, according to The New York Times.
DeepSeek’s strong models
DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023.
DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well on a variety of AI benchmarks and was far cheaper to run than comparable models at the time. It forced DeepSeek’s domestic competition, including ByteDance and Alibaba, to cut the usage prices for some of their models and make others completely free.
DeepSeek-V3, released in December 2024, only added to DeepSeek’s notoriety.
According to DeepSeek’s internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models such as Meta’s Llama and models that can only be accessed through an API, such as OpenAI’s GPT-4o.
Equally impressive is DeepSeek’s R1 “reasoning” model. Released in January, R1 performs as well as OpenAI’s o1 model on key benchmarks, DeepSeek claims.
Because R1 is a reasoning model, it effectively checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer to arrive at solutions than typical non-reasoning models. The upside is that they tend to be more reliable in domains such as physics, science, and math.
There is a downside to R1, DeepSeek V3, and DeepSeek’s other models, however. Because they were developed in China, they are subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for example, R1 won’t answer questions about Tiananmen Square or Taiwan’s autonomy.
A disruptive approach
If DeepSeek has a business model, it’s not clear what that model is, exactly. The company prices some of its products and services well below market value and gives others away for free.
The way DeepSeek tells it, efficiency breakthroughs have enabled it to remain extremely cost-competitive. Some experts dispute the figures the company has supplied, however.
Whatever the case, developers have taken to DeepSeek’s models, which aren’t open source as the phrase is commonly understood but are available under permissive licenses that allow for commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers on Hugging Face have created over 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.
DeepSeek’s success against larger and more established rivals has been described as “upending AI” and “over-hyped.” The company’s success was at least partly responsible for Nvidia’s stock price dropping by 18% in January and for eliciting a public response from OpenAI CEO Sam Altman.
Microsoft has announced that DeepSeek is available on its Azure AI Foundry service, Microsoft’s platform that brings together AI services for enterprises under a single banner. When asked about DeepSeek’s impact on Meta’s AI spending during its first-quarter earnings call, CEO Mark Zuckerberg said spending on AI infrastructure will continue to be a “strategic advantage” for Meta.
During Nvidia’s fourth-quarter earnings call, CEO Jensen Huang emphasized DeepSeek’s “excellent innovation,” saying that it and other “reasoning” models are great for Nvidia because they need so much more compute.
At the same time, some companies are banning DeepSeek, and so are entire countries and governments, including South Korea. New York state has also banned DeepSeek from being used on government devices.
As for what DeepSeek’s future might hold, it’s not clear. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence. In March, The Wall Street Journal reported that the U.S. will likely ban DeepSeek on government devices.
This story was originally published on January 28th, 2025 and will be updated regularly.