An AI developed by Google DeepMind can solve geometry problems from the International Mathematical Olympiad (IMO) nearly as well as the competition's top human participants.
Gregor Dolinar, the president of the IMO, praises AlphaGeometry's performance as astonishing and predicts that an AI will soon be good enough to claim the prestigious IMO gold medal.
The IMO, aimed at high school students, is among the most demanding math competitions in the world, requiring a level of mathematical ingenuity that AI models have traditionally struggled to match. General-purpose systems such as GPT-4, despite their strength elsewhere, fail outright on IMO geometry questions, and even AI models built specifically for math have scored below the average human contestant.
The difficulty of the problems is only part of the hurdle; training data is also scarce. With just six questions set in each annual competition since 1959, the IMO offers nowhere near the volume of data that many AI models require. Geometry questions pose a particular challenge, because their intricate proofs about angles and lines in complex shapes are hard to translate into a machine-readable format.
To overcome this data limitation, Thang Luong and his team at Google DeepMind built a tool that generates hundreds of millions of machine-readable geometric proofs. They trained AlphaGeometry on this synthetic dataset and tested it on 30 IMO geometry questions; it answered 25 correctly, close to the score expected of an IMO gold medalist.
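DeepMind's actual pipeline is far more sophisticated, but its core pattern can be illustrated with a heavily simplified toy sketch. In the Python below, the facts and deduction rules are abstract placeholders rather than real geometry, and every name is invented for illustration: sample random premises, run a symbolic engine forward to find everything they imply, then trace each derived fact backwards to extract a (premises, goal, proof) training example.

```python
import random

# Toy deduction rules: each maps a set of premise facts to one derived fact.
# In AlphaGeometry these would be geometric rules (about angles, parallel
# lines and so on); here they are abstract placeholders.
RULES = {
    frozenset({"A", "B"}): "C",
    frozenset({"C", "D"}): "E",
    frozenset({"B", "D"}): "F",
    frozenset({"E", "F"}): "G",
}

def forward_chain(premises):
    """Run the symbolic engine to closure, recording how each fact arose."""
    known = set(premises)
    derivation = {}  # derived fact -> the facts that produced it
    changed = True
    while changed:
        changed = False
        for body, head in RULES.items():
            if body <= known and head not in known:
                known.add(head)
                derivation[head] = body
                changed = True
    return known, derivation

def traceback(goal, derivation):
    """Walk the derivation backwards to extract a minimal proof of `goal`."""
    steps = []
    def visit(fact):
        if fact in derivation:
            for parent in sorted(derivation[fact]):
                visit(parent)
            step = (sorted(derivation[fact]), fact)
            if step not in steps:
                steps.append(step)
    visit(goal)
    return steps

def generate_examples(n):
    """Sample random premise sets; keep every (premises, goal, proof) found."""
    examples = []
    for _ in range(n):
        premises = random.sample(["A", "B", "C", "D"], k=3)
        known, derivation = forward_chain(premises)
        for goal in known - set(premises):
            examples.append((sorted(premises), goal, traceback(goal, derivation)))
    return examples

if __name__ == "__main__":
    for premises, goal, proof in generate_examples(10)[:3]:
        print(premises, "proves", goal, "via", proof)
```

Because the symbolic engine both derives the conclusions and records how it derived them, every random construction that leads anywhere yields verified proof text for free, which is what makes it possible to generate training data at this scale.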
Luong says AI systems still struggle with tasks that demand deep reasoning and planning many steps ahead, which is why mathematics is such a significant benchmark for progress towards artificial general intelligence.
AlphaGeometry comprises two components that its creators compare to distinct thinking systems in the brain: a fast, intuitive neural language model, built on technology similar to that behind ChatGPT, and a slower, more analytical engine for symbolic reasoning. Working together, the language model quickly suggests constructions and arguments while the symbolic engine meticulously verifies them and assembles the proof.
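As a rough illustration of how such a loop can alternate between the two systems, here is a minimal Python sketch. The components are stand-ins invented for this example: a fixed candidate list plays the role of the trained language model, and a handful of toy rules plays the role of the geometry engine.

```python
# Stand-in components: `symbolic_close` plays the deduction engine and
# `propose_construction` plays the language model (here a fixed list of
# candidate auxiliary facts rather than a trained network).

RULES = {
    frozenset({"premise", "aux1"}): "lemma",
    frozenset({"lemma", "aux2"}): "goal",
}

def symbolic_close(facts, rules):
    """Exhaustively apply deduction rules: the slow, careful system."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules.items():
            if body <= facts and head not in facts:
                facts.add(head)
                changed = True
    return facts

def propose_construction(facts, candidates):
    """Stand-in for the neural model's fast, intuitive suggestion step."""
    for candidate in candidates:
        if candidate not in facts:
            return candidate
    return None

def solve(premises, goal, rules, candidates, max_constructions=5):
    """Alternate deduction and construction until the goal is reached."""
    facts = set(premises)
    for _ in range(max_constructions + 1):
        facts = symbolic_close(facts, rules)           # deduce all consequences
        if goal in facts:
            return True
        aux = propose_construction(facts, candidates)  # ask the "intuition"
        if aux is None:
            return False
        facts.add(aux)
    return False

print(solve({"premise"}, "goal", RULES, candidates=["aux1", "aux2"]))  # True
```

The real system searches over many ranked suggestions from the language model rather than trying a fixed list in order, but the division of labor is the same: intuition proposes, deduction disposes.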
While AlphaGeometry demonstrates prowess in solving IMO geometry problems, its solutions tend to be lengthier and less elegant than human proofs. Nonetheless, the AI can uncover alternative solutions overlooked by humans, showcasing its potential for mathematical innovation.
Despite its success, AlphaGeometry's mathematical knowledge remains limited to roughly undergraduate-level theorems, raising the question of whether broadening that knowledge could improve its performance and perhaps even surface new mathematical insights.
A harder challenge lies ahead: exploring mathematical territory with no predefined endpoint, the open-ended kind of work from which novel discoveries and breakthroughs tend to emerge.
XTX Markets' $10 million AI Mathematical Olympiad (AIMO) challenge fund offers incentives to develop mathematical AI models, with substantial rewards for milestones such as solving an IMO geometry problem and, ultimately, winning an IMO gold medal.
DeepMind has not disclosed plans to enter AlphaGeometry in a live IMO contest or to extend it to non-geometry IMO problems. But the company's track record in public competitions, such as its AlphaFold system's success in protein-structure prediction challenges, hints at potential future endeavors in the mathematical domain.