Two artificial intelligence systems developed by Google DeepMind have solved four of the six problems from this year’s International Mathematical Olympiad, matching the performance of a silver medalist in the prestigious competition for high school students.
This remarkable feat demonstrates a significant leap forward in AI’s ability to tackle complex mathematical challenges, an area that has long been considered a formidable hurdle in machine learning.
The AI systems, named AlphaProof and AlphaGeometry 2, showcased their prowess by solving problems that typically require advanced mathematical reasoning and creativity.
AlphaProof, which uses reinforcement learning, solved three problems: two in algebra and one in number theory. Meanwhile, AlphaGeometry 2 solved the geometry problem in just 19 seconds.
Pushmeet Kohli, vice president of research at DeepMind, emphasized the significance of this accomplishment, stating, “These are extremely hard mathematical problems and no AI system has ever achieved a high success rate in these types of problems.”
The AI’s performance is particularly impressive given the limited availability of training data for math-focused models, prompting the DeepMind team to employ synthetic data generated by AI itself.
The development process involved fine-tuning Google’s Gemini model to translate mathematical problem statements into Lean, a programming language and proof assistant in which proofs can be checked by machine.
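To give a sense of what such a translation looks like, here is a minimal sketch in Lean 4. The statement, the theorem name `even_add_even`, and the proof are illustrative assumptions chosen for brevity; they are not taken from the Olympiad problems or from DeepMind's system.

```lean
-- Informal statement: "The sum of two even natural numbers is even."
-- Hypothetical formalization (Lean 4, core library only); evenness is
-- written as "remainder 0 when divided by 2".
theorem even_add_even (a b : Nat) (ha : a % 2 = 0) (hb : b % 2 = 0) :
    (a + b) % 2 = 0 := by
  omega  -- built-in decision procedure for linear integer arithmetic
```

Once a statement is written this way, Lean either accepts a proposed proof or rejects it, which leaves no room for a plausible-sounding but incorrect argument.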
AlphaProof then generated candidate solutions and tried to prove or disprove them by searching over possible proof steps in Lean, using each verified proof to improve its performance.
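The generate-and-check idea described above can be illustrated with a toy loop in Python. This is only a sketch under loud assumptions: the equation, the function names `verify` and `search`, and the weight-update rule are all hypothetical stand-ins, and the exact checker here plays the role that Lean plays for AlphaProof. It is not DeepMind's algorithm.

```python
import random


def verify(candidate: int) -> bool:
    """Exact checker (a stand-in for a proof verifier): is x^2 - 5x + 6 = 0?"""
    return candidate * candidate - 5 * candidate + 6 == 0


def search(candidates, rounds=200, seed=0):
    """Sample guesses, keep only verified ones, and favor them in later rounds."""
    rng = random.Random(seed)
    weights = {c: 1.0 for c in candidates}  # start with a uniform preference
    verified = set()
    for _ in range(rounds):
        pool = list(weights)
        guess = rng.choices(pool, weights=[weights[c] for c in pool])[0]
        if verify(guess):
            verified.add(guess)
            weights[guess] += 1.0  # reinforce only answers that passed the check
    return verified, weights


verified, weights = search(range(-10, 11))
print(sorted(verified))
```

The point of the design is that the learner is never rewarded for a guess that merely looks right: only candidates that pass the exact check are reinforced.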
Fields medalist Timothy Gowers, who served as one of the judges evaluating the AI’s work, expressed surprise at the systems’ ability to devise clever problem-solving strategies. “I find it very impressive and a significant jump from what was previously possible,” Gowers remarked, though he cautioned that further research is needed to fully understand the AI’s problem-solving methods.
While the AI systems scored 28 out of 42 possible points, narrowly missing the gold-medal threshold, they are not yet contributing new mathematical knowledge.
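The score follows from simple arithmetic, sketched below in Python. Each Olympiad problem is marked out of 7 points; the sketch assumes full marks on every solved problem and none on the rest, and the helper name `imo_score` is hypothetical.

```python
POINTS_PER_PROBLEM = 7  # each IMO problem is marked out of 7 points
TOTAL_PROBLEMS = 6      # six problems per competition


def imo_score(solved: int) -> tuple[int, int]:
    """Return (score, maximum), assuming full marks on each solved problem."""
    return solved * POINTS_PER_PROBLEM, TOTAL_PROBLEMS * POINTS_PER_PROBLEM


score, maximum = imo_score(4)
print(f"{score} out of {maximum}")  # prints "28 out of 42"
```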
David Silver, DeepMind’s vice president of reinforcement learning, clarified that the current achievement demonstrates an ability to solve challenging problems rather than to tackle open research questions.
This breakthrough in AI’s mathematical problem-solving abilities opens up exciting possibilities for the future of both artificial intelligence and mathematics.
As these systems continue to evolve, they may eventually assist human mathematicians in exploring new frontiers of mathematical knowledge and potentially revolutionize how complex problems are approached across various scientific disciplines.