The first open-source model to reach math olympiad gold-medal level! DeepSeek's new model wins high praise from users: a published technical report, impressive!
```
DeepSeek's newly released open-source mathematics model puts the company on a stage where it can compete with tech giants such as OpenAI and Google. DeepSeekMath-V2 has reached gold-medal level at what is widely regarded as the world's toughest high school math competition, becoming the first open-source model to do so and marking a significant breakthrough for open-source AI in complex reasoning.
Yesterday, DeepSeek announced the launch of its latest mathematical reasoning model, DeepSeekMath-V2, which solved 5 out of 6 problems in the simulated 2025 International Mathematical Olympiad (IMO), reaching gold medal status. This accomplishment makes it the first open-source model to win gold in an IMO-level competition, drawing intense attention from the AI research and developer communities.
This performance puts it directly alongside industry giants. In July this year, an advanced version of Google DeepMind's Gemini and an experimental reasoning model from OpenAI also reached the IMO 2025 gold-medal standard by solving 5 problems, becoming the first AI models to attain that level. However, unlike Google's and OpenAI's closed-source experimental models, the DeepSeekMath-V2 weights have been released for public download under the Apache 2.0 license.
It is worth noting that DeepSeekMath-V2 adopts an innovative self-verification training framework. The core of this method is to train a dedicated "verifier" whose task is to assess the quality of the proof process rather than simply the correctness of the final answer. Moreover, to prevent the model from overfitting its own checking mechanism, DeepSeek continuously raises the difficulty of the verification process by increasing computing load and automatically labeling proofs that are hard to verify, ensuring that the verifier and generator evolve in sync.
This move is seen as an important step towards AI democratization. The release of this model not only proves that the open-source community is capable of catching up to or even matching top closed-source labs in cutting-edge AI research, but may also rekindle debates over whether open-source models could erode the commercial moat of closed-source products—a topic that has, at times, shaken investors’ confidence in AI giants like Nvidia.
Joining the Top Ranks: Competing with OpenAI and Google
DeepSeekMath-V2's outstanding performance signifies that it is now on equal footing with the world’s leading AI labs in the field of complex mathematical reasoning. The International Mathematical Olympiad (IMO) is widely regarded as the hardest math competition for high school students globally; in the 2025 competition, only 72 of 630 human participants earned a gold medal.
Besides its achievements at IMO 2025, the model has also demonstrated top-tier performance in other high-level math competitions. According to DeepSeek, it also reached gold medal status in China’s premier national competition, the Chinese Mathematical Olympiad (CMO).
In the Putnam Mathematical Competition (Putnam 2024) for undergraduates, the model solved 11 out of 12 problems completely, with only a minor error in the twelfth, ultimately scoring 118/120—surpassing the top human score of 90 points.
An Open-Source Milestone: Community Lauds “Remarkable Release”
Compared with the experimental models from Google and OpenAI, which have yet to be made public, the core attraction of DeepSeekMath-V2 lies in its complete openness. The model weights have already been released on the open-source platform Hugging Face, where researchers and developers can download them freely.
Clement Delangue, co-founder and CEO of Hugging Face, praised the release on the social platform X: “Imagine you can have the brain of one of the best mathematicians in the world for free.”
He added, “To my knowledge, there has previously been no chatbot or API that gives you access to an IMO 2025 gold-level model.” He emphasized that users can explore, fine-tune, optimize the model, and run it on their own hardware “with no company or government able to take it back. This is the best embodiment of AI and knowledge democratization.”

Another user, elie, also commented: “Is DeepSeek Math V2 the first open-source model to reach gold at the IMO? Plus we got the technical report—this really is a remarkable release.”

Other users commented that they liked how the approach stacks 5-7 relatively simple ideas on top of one another to yield unexpectedly good results, which makes it resemble engineering more than research.

Self-Verification Framework: Beyond Answers, Focusing on Reasoning
DeepSeek pointed out in its technical report that although recent AI models excel at getting correct answers on mathematical benchmarks, they often lack rigorous reasoning processes. The report stated: “Many mathematical tasks such as theorem proving require rigorous step-by-step derivations, not just a numerical answer.”
To address this, DeepSeekMath-V2 uses an innovative self-verification training framework. The core is to train a dedicated “verifier” whose task is to assess the quality of the proof process, not just the correctness of the final answer. This verifier then serves as a reward model that guides a separate “proof generator.” The generator receives a reward only when it successfully identifies and fixes mistakes in its own proofs.
This mechanism incentivizes the model to identify and resolve as many issues as possible in its reasoning chain before arriving at a final answer. DeepSeek emphasized, “For open-ended problems with no known solution, self-verification is particularly important in scaling test-time compute.” Test-time compute refers to allocating substantial computing resources during the inference phase, allowing the model more time to reason, explore various solutions, and refine answers.
Dynamic Evolution System: Solving the “Self Overfitting” Challenge
To prevent the model from overfitting its own checking mechanism—that is, merely learning to trick its own verifier—DeepSeek uses a dynamic evolution strategy. The team continuously raises the difficulty of the verification process by increasing compute and automatically tagging proofs that are hard to verify, ensuring that the verifier and generator evolve together.
DeepSeek explained in its technical document that this approach allows them to “scale verification computation to automatically tag new, difficult-to-verify proofs, thereby creating new training data to further improve the verifier.” Through this verification-generation closed loop and meta-verification mechanism, the model achieves fully automated data labeling and continual performance improvements, validating the feasibility of self-driven learning systems in solving complex mathematical reasoning tasks.
```
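The self-verification loop described under "Self-Verification Framework" can be illustrated with a toy sketch. This is not DeepSeek's implementation: the arithmetic "proof" format and the names `Step`, `verify`, `score`, `refine`, and `reward` are all hypothetical, chosen only to show a verifier that scores every step of a derivation and a generator that is rewarded only once its refined proof passes verification.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One step of a toy 'proof': the claim that a + b equals `claimed`."""
    a: int
    b: int
    claimed: int


def verify(proof):
    """Toy verifier: return the indices of steps whose claim is wrong."""
    return [i for i, s in enumerate(proof) if s.a + s.b != s.claimed]


def score(proof):
    """Proof-quality score in [0, 1]: the fraction of steps that check out."""
    return 1.0 - len(verify(proof)) / len(proof)


def refine(proof, max_rounds=3):
    """Generator self-check: repair flagged steps in place, then re-verify."""
    for _ in range(max_rounds):
        flagged = verify(proof)
        if not flagged:
            break
        for i in flagged:
            proof[i].claimed = proof[i].a + proof[i].b
    return proof


def reward(draft, max_rounds=3):
    """Reward 1.0 only if the refined copy of the draft passes verification."""
    final = refine([Step(s.a, s.b, s.claimed) for s in draft], max_rounds)
    return 1.0 if score(final) == 1.0 else 0.0


# Hypothetical draft with one faulty step (3 + 4 is claimed to be 8).
draft = [Step(1, 2, 3), Step(3, 4, 8), Step(7, 5, 12)]
print(verify(draft))                # indices of the flagged steps
print(reward(draft))                # reward after self-refinement
print(reward(draft, max_rounds=0))  # reward when no refinement is allowed
```

In this toy setting the verifier is exact and cheap. In the system the article describes, the verifier is itself a trained model, which is why hard-to-verify proofs have to be labeled with extra verification compute and fed back as training data.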