The International Mathematical Olympiad (IMO) is one of the most esteemed competitions for pre-university students worldwide. Every year, students from across the globe compete for its bronze, silver, and gold medals; in 2023, 112 countries took part. Now, in a groundbreaking development, AI programs may soon join the ranks of competitors.
Recently, a team led by Trieu H. Trinh of Google DeepMind and New York University introduced a novel AI system named AlphaGeometry in the journal Nature on January 17, 2024. The team reported that AlphaGeometry solved 25 of 30 geometry problems from past IMOs, a success rate comparable to that of top human participants who earned gold medals. The AI even discovered a more general solution to a 2004 IMO problem than the one experts had previously found.
In their recent publication, Trinh and his collaborators highlighted that “Mathematical Olympiads are the most reputed theorem-proving competitions globally.” During the two-day event, each student tackles six problems spanning various mathematical domains. Some problems are exceptionally intricate, posing challenges even for experts. While these problems often feature concise and elegant solutions, they demand a high degree of creativity. This complexity renders them intriguing from the perspective of AI research, which aims to cultivate systems with creative capabilities. To date, even advanced language models like OpenAI’s GPT-4 have faltered at such tasks.
The primary obstacle for AI programs has been the scarcity of suitable training data. Language models such as GPT-4 are trained on vast amounts of text, equivalent to approximately 20 million letter-sized pages, but machine-readable mathematical proofs are rare by comparison. Converting a proof into a formal language such as Lean entails substantial human effort, and geometry in particular has proved arduous to formalize for computational solutions.
Trinh and his team sidestepped this bottleneck by generating a dataset that requires no translation of human-written proofs into a formal language. First, an algorithm sampled sets of geometric “premises,” or starting configurations, such as a triangle with its altitudes drawn and additional points marked along its sides. A deductive algorithm then inferred further properties of each configuration, such as congruent angles or perpendicular lines. By applying predefined geometric and algebraic rules in this way, the team produced a training dataset of more than 100 million problems with corresponding proofs.
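The deduction step can be pictured as a forward-chaining closure over symbolic facts: apply every rule to the current fact set until nothing new appears. The sketch below is a minimal illustration under invented conventions, not AlphaGeometry's actual engine (which combines deduction with algebraic reasoning); the fact encoding and the single rule are made up for this example.

```python
from itertools import product

def forward_close(facts, rules, max_rounds=100):
    """Apply every rule repeatedly until no new facts appear."""
    facts = set(facts)
    for _ in range(max_rounds):
        new = set()
        for rule in rules:
            new |= rule(facts)
        if new <= facts:           # fixed point reached
            break
        facts |= new
    return facts

def perp_perp_to_para(facts):
    """Toy rule: if a is perpendicular to b and c is perpendicular
    to b, with a != c, then a is parallel to c."""
    perps = set()
    for fact in facts:
        if fact[0] == "perp":
            _, x, y = fact
            perps.add((x, y))
            perps.add((y, x))      # perpendicularity is symmetric
    out = set()
    for (a, b), (b2, c) in product(perps, perps):
        if b == b2 and a != c:
            out.add(("para",) + tuple(sorted((a, c))))
    return out

# Two lines both perpendicular to the same line h ...
premises = {("perp", "AB", "h"), ("perp", "CD", "h")}
closure = forward_close(premises, [perp_perp_to_para])
# ... are deduced to be parallel: ("para", "AB", "CD") is in the closure.
```

Running the engine on generated premise sets and recording which derived facts follow from which is, in spirit, how problem–proof pairs can be mass-produced without any human-written input.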
Such deductive methods suffice for routine theorems, but IMO-level problems typically demand the invention of new proof terms: auxiliary points, lines, or circles that appear nowhere in the problem statement. This is where the language model comes in. Trained on the vast synthetic dataset, AlphaGeometry, much like GPT-4, learns statistical patterns; rather than deducing a solution directly, it learns to propose which auxiliary objects are likely to be useful.
When AlphaGeometry tackles a problem, the deductive algorithm first derives everything it can from the given information. If the goal has not yet been reached, the language model suggests a new element, say a fourth point X added to a triangle ABC, chosen to establish further geometric relationships, and the deduction engine runs again on the enlarged configuration. Alternating between deduction and construction in this way eventually yields the desired solution. Notably, AlphaGeometry outperformed traditional approaches such as Wu's method and approached the performance of human gold medalists on IMO geometry problems.
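The alternation described above can be sketched as a simple loop: deduce, check the goal, and if it is not reached, ask the model for an auxiliary construction. Everything below is a toy stand-in under stated assumptions: the fact tuples are invented, and the "proposer" is a hard-coded stub that always suggests a midpoint, whereas the real system uses a trained transformer and a far richer deduction engine.

```python
def deduce(facts):
    """Toy deductive closure: a midpoint splits its segment
    into two equal halves."""
    facts = set(facts)
    for fact in list(facts):
        if fact[0] == "midpoint":
            _, m, a, b = fact
            facts.add(("equal_seg", m + a, m + b))
    return facts

def propose_construction(facts):
    """Stub standing in for the language model: always suggest the
    classic auxiliary point 'M, the midpoint of BC'."""
    return {("midpoint", "M", "B", "C")}

def solve(premises, goal, max_constructions=3):
    """Alternate deduction with model-suggested constructions."""
    facts = deduce(premises)
    constructions = []
    for _ in range(max_constructions):
        if goal in facts:
            return constructions            # proof found
        aux = propose_construction(facts)   # ask the "model" for help
        constructions.append(aux)
        facts = deduce(facts | aux)         # re-run deduction
    return constructions if goal in facts else None

# The goal mentions point M, which does not exist until the
# construction step introduces it.
steps = solve(premises={("triangle", "A", "B", "C")},
              goal=("equal_seg", "MB", "MC"))
```

Here `steps` records exactly one construction: deduction alone could not reach the goal, but one well-chosen auxiliary point made it provable, which is the division of labor the article describes.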
AlphaGeometry's success underscores the potential of AI in mathematical problem-solving. Its current scope is limited to geometry, but extensions to other subdisciplines, such as combinatorics, hold promise. The prospect of an AI competing in the IMO, and perhaps even securing a gold medal, no longer seems far off.