Tencent Holdings, a prominent Chinese internet giant, has introduced an artificial intelligence (AI) model for converting images to videos in collaboration with academic partners. This release coincides with the growing interest in content-generating tools such as OpenAI’s ChatGPT and Sora.
The newly unveiled image-to-video AI model allows users to interact with a photo by selecting specific areas and providing simple commands for movement. The image-animation tool, named Follow-Your-Click, hosted on Microsoft’s GitHub platform, can transform a static image into a brief animated clip.
This innovative project is a result of a partnership involving Tencent’s Hunyuan group, the Hong Kong University of Science and Technology, and Tsinghua University, a prestigious institution in Beijing, China.
In a related development, OpenAI’s Sora has attracted attention by showcasing its capabilities in the AI landscape. Follow-Your-Click, whose full release was set for April, is currently available as a test version on GitHub. The researchers have demonstrated how it works: a still image of a bird, given the prompt “flap the wings,” becomes a short MP4 clip of a colorful bird fluttering its wings. Similarly, a picture of a woman with the command “wind” results in an animation featuring lightning in the background.
The collaborative team behind Follow-Your-Click aims to address a limitation of existing image-to-video models, which often animate entire scenes rather than specific elements within an image. By simplifying user commands and enhancing generation performance, the model offers more precise control over the animation process.
In a paper published on arXiv, the researchers said their framework offers a simpler user interface and better generation quality than previous methods.
As competition heats up in the AI industry, particularly in video generation technology, companies like Silicon Valley’s Pika Labs, founded by Chinese PhD candidate Guo Wenjing from Stanford University, are making significant strides. Tencent’s counterparts in China, including Alibaba Group Holding, are also actively participating in this technological race. Alibaba recently introduced a video-generation tool called EMO, which turns a still portrait and an audio clip into talking and singing videos.