Google’s Gemini Demo Falls Short of Initial Hype


Google edited an impressive demonstration of Gemini in ways that hid the human prompting behind it, making the AI's interaction look more polished and fluid than it actually was.

In the video, Gemini appeared to respond in real time to a person's spoken prompts and to what it could see. Google touted this “multimodality,” the ability to process audio and video data, as a pivotal feature.

In one humorous moment in the video, the AI remarks that blue ducks are uncommon and is then shown a toy blue duck.

“What the quack!” Gemini exclaims in astonishment. It notes that it had just been talking about a blue duck and is now being shown one, and concludes that blue ducks must be more common than it had thought.

However, a Google spokesperson informed Bloomberg that the demonstration “utilized static image frames from the video with text prompts.”

A detailed explanation of this process was provided in a Google blog post published on Wednesday.

In other words, there was no live voice interaction and no real-time response. Gemini was working much as ChatGPT, a competing chatbot, does: from text prompts and shared images.

The Gemini demo was widely shared across Google's channels, and CEO Sundar Pichai’s post on X drew more than 7 million views. Yet the note that latency had been reduced and Gemini's outputs shortened for brevity appeared only in the YouTube video description.

According to Oriol Vinyals, Google DeepMind’s vice president of research, the video accurately depicts Gemini’s capabilities and results, condensed for clarity. “The video showcases the potential user experience with Gemini,” he said, adding that it was made to inspire engineers.

Since its release, the video has sparked significant interest in Gemini.

One user who reshared the video claimed that Gemini exhibited “broader knowledge than a substantial number of adults,” while another said they were awestruck by it.

