For the past week, social media platforms have been flooded with A.I.-generated videos from OpenAI’s Sora. While many observers have marveled at the intricate and lifelike virtual worlds created by this generative tool, the overall reception seems more subdued than the enthusiastic response that greeted DALL-E two years ago.
Like DALL-E, Sora works from text or image prompts, producing silent videos up to one minute long in various resolutions. To introduce the tool, OpenAI’s CEO Sam Altman invited users of X (formerly Twitter) to propose prompts, then shared the A.I.-generated videos that resulted. The showcased examples included a quirky bicycle race among diverse ocean creatures, a tutorial on traditional Tuscan cuisine in a rustic kitchen, and two golden retrievers hosting a podcast atop a mountain.
Sora is currently undergoing safety testing and is not yet accessible to the general public. Nonetheless, filmmakers, animators, and VFX artists are swiftly sizing up the new technology, much as artists working in still images did when text-to-image generators arrived. While renowned directors like Wes Anderson and Tim Burton may not be immediately threatened by a machine lacking charm, sophistication, or attention to detail, specialists who craft fantastical realms may face a different scenario.
The debut of Sora inevitably took center stage in a series of panel discussions on the future of A.I. in film production at the Berlinale over the weekend.
“This is a game-changer,” L.A.-based director Dave Clark told The Hollywood Reporter. “The individuals wielding these tools are the ones to watch out for.” He stressed, however, that traditional storytelling techniques would remain pivotal in crafting films with broad appeal. “How will you captivate your audience after showcasing that 60-second scene of an astronaut gliding through space?”
“A.I. is a topic many shy away from, but it’s a new reality we must embrace,” noted Simon Weisse, a prop designer specializing in miniatures, who highlighted A.I.’s potential applications. “Instead of spending days scouring Google for background images for miniatures, we now rely on ChatGPT.”
The development process behind such an impressive text-to-video generator is poised to attract scrutiny. OpenAI has released a technical report detailing the A.I. models utilized, yet information regarding the training data remains relatively scant. While some data was licensed, and some deemed “publicly available,” specifics on the latter category remain ambiguous.
This lack of transparency aligns with the wave of lawsuits OpenAI is currently facing from artists, authors, and the New York Times over alleged use of copyrighted material as training data. Last year, the company openly acknowledged to the U.K. government that, without a legal exemption granting access to copyrighted data, training today’s leading A.I. models would be unfeasible.
“Did the providers of training data consent to the use of their work?” questioned Ed Newton-Rex, CEO of Fairly Trained, on X. “The dearth of information from OpenAI on this matter does not instill confidence. Across the A.I. industry, individuals’ work is being exploited without consent to develop products that compete with said work.”
Gary Marcus, a prominent A.I. skeptic, reflected on the evolution of A.I. technology, stating, “When I began working on A.I. four decades ago, the idea of derivative mimicry, siphoning value from artists and creators to corporate entities, never crossed my mind.”
The concerns raised by the release of Sora extend beyond copyright issues. While OpenAI products are programmed to avoid generating violent, hateful, or sexual content, experts in the A.I. field are apprehensive about the technology’s potential for spreading misinformation, especially in the context of upcoming elections around the world. For now, A.I.-generated videos often contain identifiable errors, such as lapses in continuity and implausibly rendered cause and effect.
Sora’s capability to animate image prompts in potentially misleading ways has prompted a cautious approach: public demonstrations are being withheld for now, and an eventual general release remains uncertain.
“There is a clear trajectory of improvement in text-to-video technology, inching us closer to a time where discerning between fabricated and authentic content will become increasingly challenging,” warned deepfakes expert Hany Farid in New Scientist. He also cautioned that these eerily silent videos could soon be combined with A.I.-powered voice replication.
While OpenAI may not be hastening the public release of its text-to-video generator, the inevitability of such a launch looms. Following the introduction of DALL-E 2 and ChatGPT in 2022, competitors swiftly emerged, with tech giants like Google and Meta vying to lead the charge this time around.
Startups such as Runway and Pika Labs have already unveiled text-to-video generators, while Stability A.I., the entity behind the popular text-to-image generator Stable Diffusion, has teased a similar tool. Although these tools have yet to match Sora’s crisp, high-definition output or the impressive length of its videos, OpenAI’s latest offering has undoubtedly set a new benchmark.