Google’s parent company Alphabet is bringing together two of its most ambitious research projects — robotics and AI language understanding — in an attempt to make a “helper robot” that can understand natural language commands.
Since 2019, Alphabet has been developing robots that can carry out simple tasks like fetching drinks and cleaning surfaces. This Everyday Robots project is still in its infancy — the robots are slow and hesitant — but the bots have now been given an upgrade: improved language understanding courtesy of Google’s large language model (LLM) PaLM.
Most robots only respond to short and simple instructions, like “bring me a bottle of water.” But LLMs like GPT-3 and Google’s MUM are able to better parse the intent behind more oblique commands. In Google’s example, you might tell one of the Everyday Robots prototypes “I spilled my drink, can you help?” The robot filters this instruction through an internal list of possible actions and interprets it as “fetch me the sponge from the kitchen.”
Yes, it’s kind of a low bar for an “intelligent” robot, but it’s definitely still an improvement! What would be really smart would be if that robot saw you spill a drink, heard you shout “gah oh my god my stupid drink” and then helped out.
Google has dubbed the resulting system PaLM-SayCan, the name capturing how the model combines the language understanding skills of LLMs (“Say”) with the “affordance grounding” of its robots (that’s “Can” — filtering instructions through possible actions).
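To make that “Say”/“Can” split concrete, here’s a toy sketch of the scoring loop in Python. Everything in it — the skill list, the scoring functions, the numbers — is an illustrative assumption rather than Google’s actual code. In the real system, the “say” score comes from PaLM’s estimate of how useful a skill description is as a next step, and the “can” score comes from affordance functions trained on the robot’s own experience.

```python
# Toy sketch of a SayCan-style decision rule (illustrative, not Google's code).
# "Say": how useful the language model thinks a skill is for the instruction.
# "Can": how likely the robot is to pull that skill off from its current state.
# The planner picks the skill with the highest combined score.

SKILLS = [
    "find a sponge",
    "pick up the sponge",
    "bring the sponge to the user",
    "find a water bottle",
    "done",
]

def say_score(instruction: str, skill: str) -> float:
    # Stand-in for querying the LLM; this toy version just rewards
    # word overlap between the skill and the instruction.
    words = set(instruction.lower().replace(",", "").replace("?", "").split())
    return 1.0 + sum(w in words for w in skill.split())

def can_score(skill: str) -> float:
    # Stand-in for the robot's learned affordance estimate: the probability
    # it can actually complete the skill from where it is right now.
    feasibility = {
        "find a sponge": 0.9,
        "pick up the sponge": 0.6,
        "bring the sponge to the user": 0.5,
        "find a water bottle": 0.8,
        "done": 0.1,
    }
    return feasibility[skill]

def next_skill(instruction: str) -> str:
    # The core SayCan rule: maximize say * can over the available skills.
    return max(SKILLS, key=lambda s: say_score(instruction, s) * can_score(s))

print(next_skill("I spilled my drink, can you help?"))
# -> "find a sponge"
```

Even with these made-up numbers, the shape of the idea comes through: the language model proposes plausible steps, while the feasibility scores steer the robot toward the one it can actually do right now — finding the sponge before trying to pick anything up.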
Google says that by integrating PaLM-SayCan into its robots, the bots were able to plan correct responses to 101 user instructions 84 percent of the time and successfully execute them 74 percent of the time. That’s a solid hit rate, but those numbers should be taken with a pinch of salt. We don’t have the full list of 101 commands, so it’s not clear how constrained these instructions were. Did they really capture the full breadth and complexity of language we would expect a bona fide home helper robot to comprehend? It’s unlikely.
And that’s the huge challenge for Google and others working on home robots: real life is uncompromisingly messy. There are just too many complex commands we would want to give a real home robot, from “clean up the cereal I just spilled under the couch” to “sauté the onions for a pasta sauce” (both commands packed with implied knowledge, from how to clean up cereal to where the onions are in the fridge and how to prepare them, and so on).
It’s why the only home robot this century to achieve even a modicum of success — the robot vacuum cleaner — has but one purpose in life: suckin’ dirt.
As AI delivers improvements in skills like vision and navigation, we are now seeing new types of bots enter the market, but these are still purposefully limited in what they can do. Look at Labrador Systems’ Retriever bot, for example. It’s basically a shelf on wheels that moves items from one part of the house to another. There’s certainly a lot of potential in this simple concept — the Retriever robot could be incredibly useful for people with limited mobility — but we’re still a long way from the do-anything robot butlers of our dreams.