
Tbaer
Sectors Project Managers
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
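The naming scheme in the list above encodes the base model family and the model size. As a minimal sketch, the Python below splits each name into those two parts and groups the sizes by family. The helper names `parse_distilled_name` and `group_by_base` are hypothetical and exist only for illustration, and reading the trailing "B" as billions of parameters is an assumption based on common naming convention.

```python
import re

# Model names copied from the list above.
DISTILLED_MODELS = [
    "DeepSeek-R1-Distill-Qwen-1.5B",
    "DeepSeek-R1-Distill-Qwen-7B",
    "DeepSeek-R1-Distill-Llama-8B",
    "DeepSeek-R1-Distill-Qwen-14B",
    "DeepSeek-R1-Distill-Qwen-32B",
    "DeepSeek-R1-Distill-Llama-70B",
]

# Assumed pattern: DeepSeek-R1-Distill-<base family>-<size>B
_NAME_RE = re.compile(
    r"^DeepSeek-R1-Distill-(?P<base>[A-Za-z]+)-(?P<size>\d+(?:\.\d+)?)B$"
)


def parse_distilled_name(name):
    """Hypothetical helper: return (base family, size in billions)."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    return match.group("base"), float(match.group("size"))


def group_by_base(names):
    """Hypothetical helper: map each base family to its sorted sizes."""
    groups = {}
    for name in names:
        base, size = parse_distilled_name(name)
        groups.setdefault(base, []).append(size)
    return {base: sorted(sizes) for base, sizes in groups.items()}


if __name__ == "__main__":
    for base, sizes in group_by_base(DISTILLED_MODELS).items():
        print(base, sizes)
```

Grouping this way makes it easy to see that four of the distilled models are built on Qwen and two on Llama.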
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.