I’d be dishonest if I claimed that the presentation by NVIDIA and Convai on AI-driven NPCs at the 2024 Consumer Electronics Show didn’t leave an impression on me. After reading about The Verge senior editor Sean Hollister’s firsthand encounter with the demonstration, I could see that there was notable technical prowess behind the scenes: the ability of video game AI to comprehend player voice commands and deliver reasonably prompt responses is indeed a noteworthy achievement.
Still, I couldn’t help but be skeptical of Hollister’s assertion that this advancement is “inevitable,” and I found NVIDIA’s showcase less captivating than the tech demonstrations for tools like Unreal Engine 5. Such demonstrations typically aim to highlight a technology’s strengths while downplaying the limitations that developers might face in real-world applications. Additionally, the decision to feature the same Blade Runner-inspired ramen shop a year later raised questions: wouldn’t it be more compelling to showcase the technology in a more diverse range of environments?
Hollister’s observations about the supposedly lifelike NPCs struggling with mundane descriptions of the game world and frequently faltering in response to basic inquiries only added to the skepticism, albeit in a somewhat relatable manner.
In contrast, Unreal Engine’s “Valley of the Ancient” demo presents a visually stunning game world with mechanics that seamlessly fit into a slice of a high-quality game. The casual conversation in the cyberpunk ramen shop pales in comparison to that experience.
Looking beyond the technical intricacies, it dawned on me why NVIDIA’s AI NPC concept failed to resonate with me—it essentially boils down to the fact that a significant investment of resources by the esteemed GPU company is geared towards…emulating the spontaneity of improvisation.
Therefore, for anyone aiming to outshine NVIDIA in this domain, perhaps spending less time immersed in Blade Runner and more time at a local comedy club observing groups with names like “Chewbacca’s Illegal Poker Night” could provide valuable insights. And this sentiment is not just my own—both the gaming and live role-playing communities are already exploring similar avenues.
Marvel’s Spider-Man 2: Leveraging Improv for Memorable NPCs
Recall the launch of Marvel’s Spider-Man 2, where players momentarily paused their web-slinging adventures to eavesdrop on the utterly bizarre conversations among street-level NPCs. In case you missed it, here’s a snippet of the offbeat dialogues that ensued:
Yes, there was indeed an NPC suggesting the use of petroleum jelly on a baby.
In a November article, Wired senior writer Megan Farokhmanesh shed light on how Insomniac Games orchestrated these quirky exchanges: through impromptu comedy from two talented performers. The resulting dialogues, as Farokhmanesh aptly noted, exuded a blend of hilarity and authenticity, seemingly crafted by a content creator with a dark comedic flair.
Voice actor Krizia Bajos shared with Farokhmanesh that she and her scene partner G. K. Bowes were granted creative freedom by dialogue director Patrick Michalak to unleash their improvisational skills during an “atmosphere session” aimed at enriching the game world’s ambient audio. After a few takes on such lines, they swiftly moved on to other dialogues.
Let’s conduct a quick (albeit slightly biased) cost analysis in comparison to NVIDIA’s AI tool. NVIDIA’s tool necessitates licensing and backend adjustments to generate characters capable of engaging in absurd discussions like the aforementioned example. The generation process itself entails multiple rounds of GPU-intensive computation, potentially escalating power expenses along the technology’s pipeline. Achieving such results would likely demand several hours of trial and error, coupled with human supervision and adjustments to fine-tune the output.
Conversely, Bajos and Bowes probably crafted that dialogue within an hour, leaving them ample time to brainstorm lines for other NPCs. Their performances merely require reasonable compensation and a pleasant catered meal to sustain them.
NVIDIA may argue that their tool offers animations and character creation features for developers, which is a valid point. However, is it truly imperative to render every NPC with utmost realism to captivate players’ attention? Despite my admiration for Marvel’s Spider-Man 2, those street-level New Yorkers exhibit the same lifeless gaze as the action figures adorning my shelf. The spontaneous conversations alone are sufficient to evoke joy among players.
Before I proceed, I must acknowledge that writers and performers can indeed leverage generative AI to craft absurdly humorous dialogues. For instance, the comedy duo Dudesy utilizes an “AI” of the same moniker to concoct silly prompts for their podcast improvisations, yielding outrageous clips of poorly executed celebrity impersonations. However, their attempt at generating an AI-produced George Carlin special without consulting his family diminishes my initial admiration. Consequently, I find myself more inclined to explore Disneyland’s Star Wars-themed Galaxy’s Edge attraction for valuable improv insights, rather than being captivated by NVIDIA’s offerings.
Disney’s Galaxy’s Edge: Embracing Authenticity through Live Performances
While the improv NPCs in Marvel’s Spider-Man 2 offer entertainment, they primarily engage with each other rather than with the player directly.
Venturing into the realm of Batuu at Disneyland and Disney World’s Galaxy’s Edge section transports visitors to a meticulously crafted world, complete with immersive rides and attractions staffed by performers who remain in-character throughout interactions with park guests.
During a pre-pandemic visit in 2020, I encountered two standout performers at Black Spire Outpost: Kylo Ren and Vi Moradi. Kylo Ren, accompanied by a pair of Stormtroopers, noticed the Rebel Alliance emblem on my jacket and confronted me menacingly for sporting what he deemed traitorous insignia.
As depicted in the accompanying video, I attempted to engage in banter with him, albeit with limited success due to my overwhelming excitement (and lack of professional acting skills). Nevertheless, the experience was enthralling as Ren proceeded to mock me while his entourage chimed in with supportive remarks. Although I wouldn’t classify it as complete immersion, as such an encounter in the films might have culminated differently, I was undeniably thrilled by the interaction.
There was a sense of validation in the air, as if the world acknowledged my sartorial choices and personal narrative. The performers’ improvisational skills were evident in their ability to adapt to attendees ready to participate, tailoring their performances accordingly.
However, it’s worth noting that Ren and the Stormtroopers weren’t spontaneously crafting tailored lines for each guest. These performers reportedly navigate the park armed with pre-programmed voice prompts triggered by specific gestures, a detail that might elude casual observers.
My subsequent interaction with Vi Moradi assumed a subtler tone. As a lesser-known character without a cinematic counterpart, Moradi primarily assumes the role of a conspicuous spy, engaging attendees—often children—in small-scale missions within the park.
Drawing from my knowledge of Delilah S. Dawson’s novels, where Moradi collaborates with a First Order defector named Captain Cardinal to establish a Resistance outpost on Batuu, I inquired about Cardinal’s status, curious if Disney Imagineering had prepared any special responses for guests familiar with the supplementary material.
“That’s classified,” she curtly replied, avoiding direct eye contact and maintaining her character’s clandestine demeanor. Despite feeling somewhat embarrassed by the exchange, I appreciated the depth of her character portrayal.
From a design perspective, this encounter may seem lackluster on the surface, as the guest didn’t depart feeling particularly enchanted or immersed in the universe. Yet, upon closer inspection, the actor’s response aligns seamlessly with Moradi’s persona—a covert operative who refrains from divulging sensitive information to civilians.
Additionally, such interactions help establish boundaries. Unmasked performers like Moradi encounter a multitude of guests daily, some of whom may attempt to push the boundaries of character interaction. To safeguard both the performer’s well-being and the overall guest experience, setting limitations on certain dialogues becomes essential. This preemptive measure prevents scenarios where guests monopolize the performer’s time with obscure references, thereby hindering interactions with other attendees.
While improv enthusiasts might argue that this approach contradicts the “yes, and” rule of improvisation, I view it as a sophisticated technique. By staying in character and delineating conversational boundaries, performers effectively steer the narrative and ensure a cohesive experience for all attendees—a practice that resonates with game writers tasked with defining character behaviors to maintain narrative focus.
These encounters underscore a crucial element absent in NVIDIA’s generative AI NPC demonstration. Unlike the reactive NPCs featured in NVIDIA’s presentation, the performers at Galaxy’s Edge are proactive, prepared to engage guests in diverse scenarios and adept at navigating interactions. Moreover, they exhibit a level of autonomy that transcends mere compliance with guest demands. While NVIDIA’s designers may aspire to showcase similar capabilities with their technology, the current demonstration suggests otherwise.
These characters are designed to cater to players’ desires, akin to the automatons depicted in HBO’s Westworld, programmed to respond to human whims. While I remain optimistic about the potential for AI tools to enhance the NPC experience in the future, the existing paradigm falls short. Auto-generated NPCs lack the human touch that initially inspired their creation, representing a fleeting trend that overlooks the essence of human performance.