open source

DeepSeek-R1-Zero: …

In the rapidly evolving field of artificial intelligence, large language models (LLMs) have emerged as powerful tools for various applications. One of the most exciting developments in this area is the DeepSeek-R1-Zero model, which leverages reinforcement learning (RL) to enhance reasoning …

Rejection Sampling in …

Understanding Rejection Sampling: From Basics to DeepSeek-R1 When dealing with complex probability distributions, directly sampling from them can be challenging. Rejection sampling is a clever statistical technique that allows us to overcome this hurdle. This method is not only foundational in …