Researchers at DeepMind, focusing on Artificial General Intelligence (AGI), the next frontier of artificial intelligence, recognized the need to tackle a fundamental issue. They pondered, “What exactly constitutes AGI?”
Broadly speaking, AGI is often conceptualized as an artificial intelligence system that operates akin to the human brain, capable of understanding, learning, and applying knowledge across a wide array of tasks. Expanding on this notion, Wikipedia defines AGI as “a theoretical form of intelligent agent that could learn to carry out any intellectual task that humans or animals can perform.”
According to OpenAI’s charter, AGI refers to “highly autonomous systems that outperform humans at most economically valuable work.”
Gary Marcus, an AI authority and the founder of Geometric Intelligence, characterizes AGI as “any intelligence that is flexible and general, with resourcefulness and reliability comparable to (or beyond) human intelligence.”
Drawing inspiration from Voltaire’s timeless advice, “If you wish to converse with me, define your terms,” the DeepMind team embraced a simple concept despite the diverse interpretations of AGI.
In a paper published on the preprint server arXiv, the researchers outlined “a framework for categorizing the capabilities and behaviors of AGI models” with the aim of providing a language for researchers to assess strategies, evaluate risks, and track advancements.
Shane Legg, a DeepMind co-founder credited with coining the term “AGI” two decades ago, notes that many in the field regard human-level intelligence as a guiding principle for the effort.
Highlighting the importance of clarifying definitions to avoid confusion, Legg said in an interview with MIT Technology Review, “I see so many discussions where people seem to be using the term to mean different things, and that leads to all sorts of confusion.”
In their arXiv report titled “Levels of AGI: Operationalizing Progress on the Path to AGI,” the team delineated several prerequisites for an AGI model, focusing more on its attributes than its functionalities.
Emphasizing that achieving AGI does not imply imbuing systems with attributes like consciousness or sentience, the researchers underscored the necessity for an AGI system to learn new concepts and discern when to seek human assistance.
Moreover, they advocate for a focus on potential rather than solely actual deployment scenarios, citing non-technical challenges such as legal, social, and ethical considerations that arise when measuring AGI progress.
The team introduced a hierarchy of intelligence levels, spanning from “Level 0: No AI” to “Level 5: Superhuman,” with the intermediate levels denoting “Emerging,” “Competent,” “Expert,” and “Virtuoso” achievements.
Three large language models, ChatGPT, Bard, and Llama 2, were rated “Level 1: Emerging AGI”; no existing AI system meets the criteria for any higher level of general intelligence.
Among the narrow AI programs listed, SHRDLU, an early natural-language understanding system developed at MIT, was categorized as “Level 1: Emerging Narrow AI.”
AI assistants such as Siri, Alexa, and Google Assistant are classified as “Level 2: Competent Narrow AI,” while Grammarly is rated “Level 3: Expert Narrow AI” for grammar checking.
Game-playing systems such as Deep Blue and AlphaGo are placed at “Level 4: Virtuoso Narrow AI.” Topping the list, at “Level 5: Superhuman Narrow AI,” are Stockfish, a powerful open-source chess engine, and DeepMind’s AlphaFold, renowned for predicting a protein’s 3D structure from its amino acid sequence.
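The paper’s taxonomy above can be read as an ordered scale with example systems attached to each level. The sketch below is an illustrative model of that idea, not code from the paper; the example ratings follow the narrow-AI classifications just described, and the `at_least` helper is a hypothetical convenience function.

```python
from enum import IntEnum

class AGILevel(IntEnum):
    """Performance levels from DeepMind's 'Levels of AGI' framework."""
    NO_AI = 0
    EMERGING = 1
    COMPETENT = 2
    EXPERT = 3
    VIRTUOSO = 4
    SUPERHUMAN = 5

# Narrow-AI example ratings discussed in the paper; general-purpose
# systems (e.g. ChatGPT) are rated separately, at Emerging AGI.
narrow_examples = {
    "Siri": AGILevel.COMPETENT,
    "Grammarly": AGILevel.EXPERT,
    "AlphaGo": AGILevel.VIRTUOSO,
    "AlphaFold": AGILevel.SUPERHUMAN,
}

def at_least(level: AGILevel, examples: dict) -> list:
    """Return the names of systems rated at or above the given level."""
    return sorted(name for name, lvl in examples.items() if lvl >= level)

print(at_least(AGILevel.VIRTUOSO, narrow_examples))  # ['AlphaFold', 'AlphaGo']
```

Using `IntEnum` makes the levels directly comparable, which mirrors the framework’s key property: the levels form an ordered progression, so any system’s rating can be checked against a threshold.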
Notably, the definition of AGI remains fluid and subject to evolution over time.
Meredith Ringel Morris, the principal scientist for human and AI interaction at Google DeepMind, suggests that reevaluating the definition of AGI may become necessary as our understanding of these processes deepens.
The researchers assert that it is impractical to enumerate all tasks within the purview of a truly general intelligence. Hence, an AGI benchmark should be a living benchmark, one that incorporates a mechanism for generating and reaching consensus on novel tasks.