Large models begin "cracking" math problems in bulk

```

AI breakthroughs in mathematics are accelerating. Since Christmas, 15 of the more than 1,000 unsolved problems left by the renowned mathematician Paul Erdős have moved from "unsolved" to "solved" status, and 11 of those solutions explicitly credit AI models with involvement in the solving process. This progress signals that large language models are demonstrating unprecedented capabilities in advancing the frontiers of human knowledge.

According to a TechCrunch report on Thursday, OpenAI's newly released GPT 5.2 model has achieved significant improvements in mathematical reasoning. Software engineer and former quantitative researcher Neel Somani found in testing that the model could produce a complete mathematical proof within 15 minutes, which was then verified as error-free using Harmonic's formalization tool. This performance far surpasses that of previous versions and moves AI tools into a new stage, in which they independently tackle difficult mathematical problems rather than merely play an auxiliary role.

Fields Medalist Terence Tao keeps a tally on his GitHub page: AI models have made substantive independent progress on 8 different Erdős problems, and in another 6 cases they made progress by locating and extending previous research. Although fully autonomous mathematical research has yet to be reached, the role of large models in mathematics can no longer be ignored.

This progress is having a direct impact on both the mathematics research ecosystem and the AI application market. Formalization tools such as Microsoft's open-source proof assistant Lean and AI tools like Harmonic's Aristotle are being widely adopted by top mathematicians and computer science professors, foreshadowing profound changes in academic research workflows.

From Serendipitous Discovery to Systematic Breakthroughs

Somani's discovery began with a routine test. He entered a mathematical problem into ChatGPT, and after letting the model think for 15 minutes, it returned a complete solution. The proof drew on established results such as Legendre’s formula, Bertrand's postulate, and the Star of David theorem, and ultimately arrived at an elegant solution similar to one that Harvard mathematician Noam Elkies posted on the MathOverflow forum in 2013. However, ChatGPT's final proof differed from Elkies' work in key respects and gave a more complete answer to one version of an Erdős problem.

"I wanted to establish a benchmark to understand when large language models can effectively solve open mathematical problems and where they still face difficulties," Somani stated. The surprise was that, with the latest model, that frontier began to move outward.

The Erdős problem set contains over 1,000 conjectures proposed by the Hungarian mathematician and now maintained online. These problems vary significantly in topic and difficulty and have become tantalizing targets for AI-driven mathematical research. The first autonomous solutions emerged last November from the Gemini-powered AlphaEvolve model, but more recently GPT 5.2 has performed even better in advanced mathematics. Somani describes it as "more adept at mathematical reasoning than previous versions."

Careful Assessments by Leading Mathematicians

Terence Tao takes a more nuanced view of this progress. In a post on Mastodon, he speculated that the scalability of AI systems makes them "more suitable for systematically tackling the 'long tail' of less-known Erdős problems, many of which actually have direct solutions."

"Therefore, many of the simpler Erdős problems are now more likely to be solved by pure AI methods, rather than by humans or hybrid approaches," Tao added.

This assessment points to a particular role for AI in mathematical research: not replacing human mathematicians on the hardest frontier problems, but efficiently working through the large number of medium-difficulty problems that have long been overlooked because human attention is limited. This division of labor may reshape how resources are allocated in mathematical research.

Formalization Tools Drive Real-World Applications

Another key driver is the mathematics community's recent shift toward formalization. Formalization, the translation of a proof into a form that a computer can check, is labor-intensive, but it makes mathematical reasoning easier to verify and extend. Although formalization does not necessarily depend on AI or computers, the new generation of automated tools has greatly reduced the difficulty of the work.

Microsoft Research's open-source proof assistant "Lean", developed in 2013, is now widely used in the field, while Harmonic's AI tool Aristotle promises to automate most of the formalization work.

Harmonic founder Tudor Achim believes that the sudden increase in the number of solved Erdős problems matters less than the fact that top mathematicians are starting to take these tools seriously. "What I care about more is that math and computer science professors are using these AI tools," Achim said. "These people have reputations to protect, so when they say they use Aristotle or ChatGPT, that's real evidence."

This trend indicates that AI tools have moved from the experimental stage into mainstream academic applications, potentially opening up new business opportunities for related technology companies, while also posing challenges to traditional methodologies in mathematical research.


```
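For readers unfamiliar with the three results the article says the proof cited, the sketch below checks each of them numerically for small inputs. It is only an illustration of what the named statements say, not a reconstruction of the model's proof or of Elkies' argument; all function names are hypothetical and chosen for this example.

```python
from math import comb, factorial, gcd


def legendre_exponent(n: int, p: int) -> int:
    """Exponent of the prime p in n!, using Legendre's formula:
    the sum of floor(n / p^k) over k = 1, 2, ... while p^k <= n."""
    exponent = 0
    power = p
    while power <= n:
        exponent += n // power
        power *= p
    return exponent


def exponent_by_division(m: int, p: int) -> int:
    """Exponent of p in m by repeated division (brute-force cross-check)."""
    count = 0
    while m % p == 0:
        m //= p
        count += 1
    return count


def is_prime(m: int) -> bool:
    """Trial-division primality test, adequate for small illustrative inputs."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def bertrand_prime(n: int) -> int:
    """Smallest prime p with n < p < 2n; Bertrand's postulate says one exists for n > 1."""
    for candidate in range(n + 1, 2 * n):
        if is_prime(candidate):
            return candidate
    raise ValueError("Bertrand's postulate requires n > 1")


def star_of_david_holds(n: int, k: int) -> bool:
    """Check the Star of David identity for the binomial coefficients around C(n, k):
    gcd(C(n-1,k-1), C(n,k+1), C(n+1,k)) == gcd(C(n-1,k), C(n,k-1), C(n+1,k+1))."""
    left = gcd(gcd(comb(n - 1, k - 1), comb(n, k + 1)), comb(n + 1, k))
    right = gcd(gcd(comb(n - 1, k), comb(n, k - 1)), comb(n + 1, k + 1))
    return left == right
```

For example, `legendre_exponent(10, 2)` agrees with `exponent_by_division(factorial(10), 2)`, and `bertrand_prime(10)` returns a prime strictly between 10 and 20. Numerical checks like these only spot-check individual cases; a proof assistant such as Lean is what certifies a statement for all inputs.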