Written at 10:40 am. Categories: AI Trend, Generative AI, OpenAI

### Transforming Text into Videos: Witness OpenAI’s Sora Reverse the Norms by Running on a Treadmill in the Opposite Direction

OpenAI introduces Sora, its innovative text-to-video model

OpenAI has unveiled its latest video generation model, Sora, which produces lifelike AI videos from text prompts alone. In a recent conversation with Bill Gates, OpenAI CEO Sam Altman discussed the future of ChatGPT and envisioned it creating videos from text. That aspiration has materialized as Sora, a text-to-video model that, according to the OpenAI team, can generate videos up to a minute long while maintaining visual quality and fidelity to the user’s instructions.
Images and videos provided by OpenAI

OpenAI has shared a series of demonstrations showcasing Sora’s capabilities. Detailed text prompts help the generated videos depict the intended visuals accurately. Sora can currently interpret elaborate instructions such as ‘The camera rotates around a large stack of vintage televisions all showing different programs — 1950s sci-fi movies, horror movies, news, static, a 1970s sitcom, etc, set inside a large New York museum gallery.’
Prompt: a movie trailer featuring the adventures of the 30-year-old spaceman donning a red wool knitted motorcycle helmet, set against a blue sky and salt desert, shot in cinematic style on 35mm film with vivid colors.

Moreover, OpenAI has experimented with prompts such as ‘A close-up view of a glass sphere that contains a zen garden within, featuring a small dwarf raking the zen garden and creating patterns in the sand’ and ‘A video depicting a Chinese Lunar New Year celebration with a Chinese Dragon.’ Sora has successfully translated these prompts into brief yet realistic video clips. OpenAI explains that Sora leverages a transformer architecture akin to its GPT models, enhancing the performance and quality of the generated videos.

Prompt: a litter of golden retriever puppies playing in the snow. Their heads pop out of the snow, covered in

In addition to generating videos from text descriptions, OpenAI’s Sora has the ability to animate static images and even extrapolate or fill in missing frames in existing videos. The model can create entire videos in one go or extend the duration of generated clips. According to OpenAI, ‘Sora operates as a diffusion model, initiating video generation from noise-like static and progressively refining it by eliminating noise across multiple steps.’
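
To make the idea of "starting from noise and refining it over multiple steps" more concrete, here is a minimal toy sketch in Python. It is not Sora's implementation, which OpenAI has not published as code. The function names (`toy_denoise`, `mean_abs_error`), the tiny five-value "frame", and the step schedule are all assumptions made for illustration. In particular, the noise prediction below is an oracle that already knows the clean target, standing in for the trained neural network that a real diffusion model would use.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Start from pure noise and refine it toward `target` over `steps` steps.

    Hypothetical illustration only: a real diffusion model predicts the noise
    with a trained network and never sees the clean target at generation time.
    """
    rng = random.Random(seed)
    # Begin with noise-like static: one Gaussian sample per "pixel".
    x = [rng.gauss(0.0, 1.0) for _ in target]
    history = [x]
    for t in range(steps):
        # Oracle stand-in for the learned denoiser: the "noise" is whatever
        # separates the current sample from the clean target.
        predicted_noise = [xi - ti for xi, ti in zip(x, target)]
        # Remove a growing fraction of the predicted noise, so that the
        # last step removes everything that remains.
        frac = 1.0 / (steps - t)
        x = [xi - frac * ni for xi, ni in zip(x, predicted_noise)]
        history.append(x)
    return x, history


def mean_abs_error(a, b):
    """Average absolute difference between two equally long sequences."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


if __name__ == "__main__":
    clean = [0.0, 0.25, 0.5, 0.75, 1.0]  # stand-in for a tiny video frame
    final, history = toy_denoise(clean, steps=20, seed=1)
    for step in (0, 5, 10, 15, 20):
        print(step, round(mean_abs_error(history[step], clean), 4))
```

Running the script prints the remaining error at a few steps, which shrinks as more noise is removed. The sketch only shows the iterative refinement loop; the hard part in a real system is the learned model that predicts the noise from a text prompt.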

Prompt: a step-printing scene depicting a person running, shot in cinematic 35mm film.

Limitations of OpenAI’s Sora

Despite its advancements, the text-to-video model Sora exhibits certain limitations. OpenAI acknowledges these shortcomings, noting that Sora may struggle with grasping the physics of a scene or understanding causal relationships. For instance, discrepancies like a person consuming a cookie without leaving a bite mark on it may arise. OpenAI further illustrates instances where Sora confuses left and right directions, evident in an AI-generated video of a man running in the opposite direction on a treadmill.

Prompt: the camera rotates around a large stack of vintage televisions all showing different programs — 1950s sci-fi movies, horror movies, news, static, a 1970s sitcom, etc, set inside a large New York museum gallery.

Another anomaly observed in Sora-generated videos is the spontaneous appearance of unmentioned objects, such as animals or individuals, and unexpected events like a basketball setting the hoop’s net on fire, followed by the sudden appearance of a new basketball descending from the sky and passing through the hoop. Furthermore, challenges with camera movements may result in shaky or unstable video outputs.

Prompt: a Chinese Lunar New Year celebration video featuring a Chinese Dragon

As of the time of reporting, OpenAI has restricted access to its text-to-video model Sora to a select group of visual artists, designers, and filmmakers, aiming to gather feedback on enhancing the model for creative professionals. While enthusiasts eagerly anticipate utilizing the AI model, concerns regarding potential risks associated with this generative technology have also surfaced.

Last modified: February 26, 2024