Google has a lot to prove with its AI efforts — but it can’t seem to stop tripping over its own feet.
Earlier this week, the tech giant announced Gemini, its most capable AI model to date, to much fanfare. In one of a series of videos, Google showed off the mid-tier version of the model, dubbed Gemini Pro, by demonstrating how it could recognize a series of illustrations of a duck, describing the changes the drawing went through at a conversational pace.
But there’s one big problem, as Bloomberg columnist Parmy Olson points out: Google appears to have faked the whole thing.
In its own description of the video, Google admitted that “for the purposes of this demo, latency has been reduced, and Gemini outputs have been shortened for brevity.” The video footage itself also carries the caption “sequences shortened throughout.”
In other words, Google misrepresented the speed at which Gemini Pro can recognize a series of images, which means we still don’t know what the model is actually capable of.
In the video, Gemini wowed observers by using its multimodal thinking chops to recognize illustrations seemingly at the drop of a hat. The video, as Olson suggests, also offered us “glimmers of the reasoning abilities that Google’s DeepMind AI lab have cultivated over the years.”
That’s indeed impressive, considering any form of reasoning has quickly become the next holy grail in the AI industry, causing intense interest in models like OpenAI’s rumored Q*.
In reality, not only was the demo significantly sped up to make it seem more impressive, but Gemini Pro is also likely still stuck with the same old capabilities that we’ve already seen many times before.
“I think these capabilities are not as novel as people think,” Wharton professor Ethan Mollick tweeted, showing how ChatGPT was effortlessly able to identify the simple drawings of a duck in a series of screenshots.
Did Google actively try to deceive the public by speeding up the footage? In a statement to Bloomberg Opinion, a Google spokesperson said the video was made by “using still image frames from the footage, and prompting via text.”
In other words, Gemini was likely given plenty of time to analyze the images. Its output may then have been overlaid on the video footage, giving the impression that it was much more capable than it really was.
“The video illustrates what the multimodal user experiences built with Gemini could look like,” Oriol Vinyals, vice president of research and deep learning lead at Google’s DeepMind, wrote in a post on X.
Emphasis on “could.” Perhaps Google should’ve opted to show the actual capabilities of its Gemini AI instead.
It’s not even the first time Google has royally screwed up the launch of an AI model. Earlier this year, when the company announced its ChatGPT competitor, a demo infamously showed Bard making a blatantly false statement, claiming that NASA’s James Webb Space Telescope took the first image of an exoplanet.
As such, Google’s latest gaffe certainly doesn’t bode well. The company came out swinging this week, claiming that an even more capable version of its latest model, called Gemini Ultra, was able to outsmart OpenAI’s GPT-4 in a test of intelligence.
But from what we’ve seen so far, we’re definitely going to wait and test it out for ourselves before we take the company’s word.