
China’s DeepSeek launches open-source AI model, rivals OpenAI-o1

DeepSeek used DeepSeek-V3-Base as the base model and employed GRPO as the RL framework to improve model performance in reasoning
The open-source DeepSeek-R1, as well as its API, will benefit the research community to distill better smaller models in the future

Chinese startup DeepSeek’s new open-source AI model, DeepSeek-R1, overtook rival ChatGPT on Monday, becoming the top-rated free application available on Apple’s App Store in the United States.

Powered by the DeepSeek-V3 model, which its creators say “tops the leaderboard among open-source models and rivals the most advanced closed-source models globally”, the app has surged in popularity among U.S. users since it was released on January 10, according to app data research firm Sensor Tower.

Despite the growing attention around R1, DeepSeek remains largely unknown. Based in Hangzhou, China, it was founded in July 2023 by Liang Wenfeng, a Zhejiang University graduate with a background in information and electronic engineering. It was incubated by High-Flyer, a hedge fund that Liang founded in 2015.

What is DeepSeek-R1?

DeepSeek-R1 is a first-generation reasoning model that achieves performance comparable to OpenAI-o1 across math, code and reasoning tasks. It is trained through large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step.

The AI model is designed to deliver high-performance natural language processing capabilities. It excels in tasks such as text generation, comprehension, and dialogue, leveraging cutting-edge machine learning techniques to provide accurate and contextually relevant responses.

DeepSeek-R1 is optimized for efficiency and scalability, making it suitable for a wide range of applications, from customer support to content creation.

Nvidia’s shares fell following last week’s launch of DeepSeek-R1, which its creators claimed can outperform rival offerings at a fraction of their cost. NVIDIA shares fell 3.12 percent to $142.62 on Friday, although they still ended higher for the week.

GRPO to improve reasoning

To develop the model, DeepSeek used DeepSeek-V3-Base as the base model and employed GRPO as the RL framework to improve model performance in reasoning.

During training, DeepSeek-R1-Zero naturally developed numerous powerful and interesting reasoning behaviors. After thousands of RL steps, it exhibited strong performance on reasoning benchmarks.

“We directly apply RL to the base model without relying on supervised fine-tuning (SFT) as a preliminary step. This approach allows the model to explore chain-of-thought (CoT) for solving complex problems, resulting in the development of DeepSeek-R1-Zero,” stated DeepSeek in its latest release.
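GRPO (Group Relative Policy Optimization) sidesteps the separate value model used in PPO-style RL: for each prompt it samples a group of responses, scores them, and normalizes each reward against the group's own statistics. The sketch below illustrates only that normalization step; it is a rough illustration, not DeepSeek's actual implementation.

```python
# Illustrative sketch of GRPO's group-relative advantage computation
# (not DeepSeek's actual code). For each prompt, a group of responses
# is sampled and scored; each reward is then normalized against the
# group's mean and standard deviation instead of a learned critic.
from statistics import mean, pstdev

def group_relative_advantages(rewards):
    """Normalize each reward against its group's mean and std deviation."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:  # every response scored the same: no learning signal
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

# Example: four sampled answers to one math problem, scored 0/1 for correctness.
# Correct answers get a positive advantage, incorrect ones a negative advantage.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Because the baseline comes from the group itself, no separate value network has to be trained, which is one reason this style of RL is comparatively cheap to run at scale.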

DeepSeek-R1-Zero demonstrates capabilities such as self-verification, reflection and generating long CoTs, marking a significant milestone for the research community.

Notably, it is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT. This breakthrough paves the way for future advancements in this area.

However, DeepSeek-R1-Zero encountered challenges such as poor readability and language mixing. To address these issues and further enhance reasoning performance, the company introduced DeepSeek-R1, which incorporates a small amount of cold-start data and a multi-stage training pipeline.

Better and smaller models for future growth

The company further explored distillation from DeepSeek-R1 to smaller dense models. Using Qwen2.5-32B as the base model, direct distillation from DeepSeek-R1 outperformed applying RL to the smaller model itself. This demonstrated that the reasoning patterns discovered by larger base models are crucial for improving the reasoning capabilities of smaller ones.

The company also open-sourced the distilled Qwen and Llama series. Its distilled 14B model outperforms the state-of-the-art open-source QwQ-32B-Preview by a large margin, and the distilled 32B and 70B models set a new record on reasoning benchmarks among dense models, on par with OpenAI-o1-mini.

“We introduce our pipeline to develop DeepSeek-R1. The pipeline incorporates two RL stages aimed at discovering improved reasoning patterns and aligning with human preferences, as well as two SFT stages that serve as the seed for the model’s reasoning and non-reasoning capabilities. We believe the pipeline will benefit the industry by creating better models,” the company added.

The company added that it showed how reasoning patterns of larger models can be distilled into smaller models, resulting in better performance compared to the reasoning patterns discovered through RL on small models.
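In practice, this kind of distillation amounts to supervised fine-tuning: the large teacher model generates reasoning traces, and those traces become training targets for the student. The sketch below shows that data flow only, with a stub standing in for the teacher; the function and field names are hypothetical, not DeepSeek's pipeline.

```python
# Hypothetical sketch of the distillation data flow: a large "teacher"
# model generates reasoning traces, which become supervised fine-tuning
# (SFT) examples for a smaller "student" model. The teacher is a stub;
# a real pipeline would call DeepSeek-R1 and capture its chain-of-thought.

def teacher_generate(prompt):
    # Stub teacher: returns a fake reasoning trace and final answer.
    return {"reasoning": f"<think>worked steps for: {prompt}</think>",
            "answer": "42"}

def build_sft_dataset(prompts):
    """Turn teacher completions into (prompt, target) pairs for SFT."""
    dataset = []
    for p in prompts:
        out = teacher_generate(p)
        # The student is trained to reproduce the full trace plus answer.
        target = out["reasoning"] + "\n" + out["answer"]
        dataset.append({"prompt": p, "target": target})
    return dataset

data = build_sft_dataset(["What is 6 * 7?"])
```

The key point the article makes is that training a small model on such teacher-generated traces worked better than running RL on the small model directly.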

The open-source DeepSeek-R1, as well as its API, will help the research community distill better, smaller models in the future.

How to use DeepSeek-R1

Follow the steps below to learn how to use DeepSeek-R1:

  1. Visit chat.deepseek.com in your browser.
  2. Sign up for an account, or log in if you already have one.
  3. After signing in, you will be taken to the chat interface.

The website has a similar layout to ChatGPT, making it easy to use for most users.


Final thoughts

The DeepSeek-R1 model stands as a transformative force in the field of natural language processing, decision-making, and AI. Its robust architecture and advanced algorithms enable it to tackle complex reasoning tasks with remarkable precision and efficiency, setting a new benchmark for AI-driven solutions.

The model demonstrates strong performance across a range of reasoning tasks. By integrating into existing workflows, it can help businesses streamline operations and improve customer interactions.

The model is licensed under the MIT License and supports commercial use, making it a great choice for businesses and developers. This open licensing model not only fosters innovation but also lowers barriers to entry, enabling startups, enterprises and independent developers to harness its capabilities without restrictions.
