For much of the past year, OpenAI’s dominance of the tech conversation looked unassailable, fueled by the buzz around ChatGPT, its talkative and occasionally eccentric chatbot.
Demis Hassabis, CEO of Google DeepMind, has now emerged as Sam Altman’s most serious rival, spearheading the development of an AI model that matches OpenAI’s celebrated bot in capability and innovation.
Since two AI-focused divisions within Alphabet were merged to create Google DeepMind last April, Hassabis has been tasked with marshaling its team to counter the rapid ascent of OpenAI, whose partnership with Microsoft posed a potential threat to Alphabet’s lucrative search business.
Google researchers contributed key ideas to the technology behind ChatGPT, but the company opted not to commercialize them over concerns about potential missteps or misuse. Under Hassabis, research and product releases have accelerated markedly, most notably with the swift progress of Gemini, a “multimodal” AI model that powers Google’s answer to ChatGPT and a range of other products. Just two months after Gemini’s introduction, Google unveiled an upgraded version, Gemini 1.5 Pro, which can analyze large amounts of text, video, and audio at once while packing more power into a smaller model.
A similar upgrade to Alphabet’s flagship model, Gemini Ultra, could help the company pull ahead of OpenAI in the race to build and deliver ever more capable AI systems.
In a conversation with WIRED senior writer Will Knight conducted over Zoom from his London residence, Hassabis shed light on the advancements in AI technology.
WIRED: Gemini 1.5 Pro can handle significantly larger inputs than its predecessor, thanks in part to an architecture known as mixture of experts. How significant are these enhancements?
Demis Hassabis: The capacity to process a reasonably sized short film is a notable advancement. This feature could prove immensely beneficial when seeking specific information within lengthy content, such as an hour-long lecture. The potential applications of this capability are vast and promising.
The concept of mixture of experts was pioneered by Google DeepMind’s chief scientist Jeff Dean, and we have refined it in this new version of Gemini. While the new Pro version hasn’t been extensively tested yet, it performs roughly on par with the largest models of the previous generation. Nothing prevents us from building an Ultra-sized model with these innovations, and that is something we are actively working on.
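The routing idea behind mixture of experts is simple enough to sketch in a few lines: a learned router sends each token to only a few of many expert sub-networks, so total capacity grows without every parameter running on every token. The sketch below is a generic top-k version in PyTorch; the class, dimensions, and expert shapes are illustrative, and Gemini’s actual architecture is not public.

```python
# Generic top-k mixture-of-experts layer. Illustrative only: Gemini's
# real design is not public; this shows the routing idea in miniature.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each "expert" is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        # The router scores every expert for each token.
        self.router = nn.Linear(dim, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Each token is processed by only top_k experts.
        scores = self.router(x)                           # (tokens, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)              # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e              # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = MoELayer(dim=64)
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 experts run per token
```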
In recent years, the escalation in computational power and data utilization for training AI models has been a driving force behind remarkable progress. Sam Altman reportedly aims to secure up to $7 trillion for additional AI chips. Will substantially increased computational power be the key to unlocking artificial general intelligence (AGI)?
There might have been some misinterpretation there; I heard it may have been yen or some other currency. Scale is undoubtedly crucial, as Nvidia’s valuation and the scramble for compute make clear. But our approach at Google Research, Google Brain, and DeepMind has always prioritized fundamental research. Over the past decade we pioneered many of the machine-learning techniques in wide use today, and that emphasis on research science distinguishes us from many other organizations. Scale is essential, but it is not the sole driver of progress: we anticipate that reaching AGI will require multiple innovations alongside scaling.
Does this imply that future advancements in AI will not solely hinge on increased computational power?
To attain AGI, several innovations beyond sheer scale will likely be needed. Scaling remains crucial, but on its own it won’t unlock new capabilities like planning, tool use, or agent-like behavior; simply scaling existing techniques won’t magically confer those abilities. We also need to explore new ideas, ideally in experiments on small problems that take only a few days to train, so they yield insights quickly. The catch is that solutions which work at small scale don’t always transfer to larger ones; the trick is finding a scale modest enough to iterate on but large enough that the results extrapolate by a factor of 10.
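Hassabis’ point about extrapolating from small runs is the logic of scaling laws: fit a power law to cheap experiments, then project it roughly a factor of 10 out, while remembering that the projection can break. A toy illustration, with entirely made-up numbers:

```python
# Toy scaling-law extrapolation: fit loss(compute) = a * compute**(-b)
# to small-scale runs, then project ~10x beyond them. The data below is
# fabricated for demonstration; real curves may bend or break at scale.
import numpy as np

compute = np.array([1e17, 3e17, 1e18, 3e18])   # training FLOPs of small runs
loss = np.array([3.10, 2.85, 2.62, 2.41])      # measured evaluation losses

# A power law is a straight line in log-log space:
# log(loss) = log(a) - b * log(compute)
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
a, b = np.exp(intercept), -slope

predict = lambda c: a * c ** (-b)
print(f"fitted exponent b = {b:.3f}")
print(f"projected loss at 10x compute (3e19 FLOPs): {predict(3e19):.2f}")
```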
Could the competition among AI companies increasingly center on tool use and agent-based AI, which OpenAI is also reportedly pursuing?
Probably. We have focused on agents, reinforcement learning, and planning for a long time, going back to the AlphaGo era. We are dusting off those ideas and thinking about how to combine AlphaGo-like capabilities with these large models. Better introspection and planning could also help with problems like hallucination. Prompting a model to reason more carefully, with phrases like “Take more care” or “Lay out your reasoning,” often yields better answers. That kind of priming nudges the model into a more systematic approach to logical reasoning, which is something we want the system to do inherently.
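The priming Hassabis describes can be as simple as prepending an instruction to the prompt. A minimal sketch, where `generate` stands in for whatever LLM completion call you use; no specific API or provider is implied:

```python
# Sketch of the prompting trick described above: asking a model to lay out
# its reasoning before answering often improves reliability. `generate` is
# a placeholder for any callable mapping prompt text to completion text.

def ask_with_reasoning(generate, question: str) -> str:
    prompt = (
        "Take more care with this problem. Lay out your reasoning "
        "step by step before giving a final answer.\n\n"
        f"Question: {question}\nReasoning:"
    )
    return generate(prompt)

# Usage with any completion function, e.g.:
# answer = ask_with_reasoning(my_llm_client.complete, "What is 17 * 24?")
```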
This area holds immense potential, and we are investing heavily in it. We expect a big step change in capability once systems behave more like agents, and we anticipate other labs working along the same lines.
Could these advancements potentially render AI models more complex or even hazardous?
I have consistently advocated for cautious progress in safety forums and conferences, emphasizing the substantial leap in capabilities anticipated with the advent of agent-like systems. Unlike the current passive Q&A systems, active learning agents will mark a paradigm shift in AI functionality. While offering enhanced utility by performing tasks efficiently, these systems will necessitate heightened vigilance and precaution.
I have long advocated for the establishment of robust simulation environments to rigorously test agents before their deployment. Such proactive measures are essential to preempt potential risks. Although current systems may not yet possess the potency to pose significant concerns, preparing for the emergence of agent systems is imperative. This transition represents a distinct phase in AI evolution, demanding a collaborative effort across government, industry, and academia to navigate this new landscape effectively.
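One hedged reading of the simulation-sandbox idea: run an agent against mock tools and check its proposed actions against an allowlist before giving it access to real systems. Everything in this sketch, the function names and the action format alike, is invented for illustration:

```python
# Illustrative sandbox harness for the "test agents in simulation first"
# idea: the agent proposes actions against mock tools, and anything not on
# the allowlist is blocked. All names and formats here are invented.
from typing import Callable

def sandbox_run(agent: Callable[[str], list[str]], task: str,
                allowed_actions: set[str]) -> bool:
    """Run the agent on a task with mock tools; flag disallowed actions."""
    actions = agent(task)  # the agent returns the actions it would take
    violations = [a for a in actions if a.split(":")[0] not in allowed_actions]
    for v in violations:
        print(f"blocked in sandbox: {v}")
    return not violations  # deploy only if the sandboxed run is clean

# Example: a toy agent that proposes one permitted and one forbidden action.
toy_agent = lambda task: ["search: weather in London", "delete_file: /etc/hosts"]
safe = sandbox_run(toy_agent, "check the weather", {"search", "read_file"})
print("safe to deploy:", safe)  # False; delete_file is not on the allowlist
```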
You previously mentioned the extended testing period for your advanced model, Gemini Ultra. Was this delay primarily due to developmental speed, or did the model present inherent challenges?
The testing duration for larger models like Gemini Ultra depends on several factors. First, fine-tuning becomes more complex as models grow, which lengthens the testing phase. Moreover, larger models have a broader spectrum of capabilities that requires careful evaluation.
As Google DeepMind consolidates into a unified entity, our approach emphasizes early releases and experimental deployment to a limited user base. Feedback from trusted early adopters informs iterative refinements before widespread dissemination.
Regarding safety considerations, how are engagements with governmental entities like the UK AI Safety Institute progressing?
Discussions with entities like the UK AI Safety Institute have been constructive, although the specifics are confidential. We grant them access to our frontier models for rigorous testing, with a focus on strengthening safety protocols, and a similar initiative is underway in the US. Current systems may not pose imminent risks, but with more advanced agent systems on the horizon, building out safety frameworks now, across the full range of stakeholders, is the right preparation for a secure transition as the technology advances.