Gemini intelligence is coming to Google Home

Google Assistant is getting a major upgrade on Nest smart speakers and displays, and Nest cameras will soon be able to tell as well as show, as Google Home gets a powerful AI infusion.

Google Home is getting a Gemini intelligence upgrade.
Image: Google Nest

While splashy chatbots may get all the attention, generative AI has real potential to make the smart home simpler and more accessible. Amazon has already announced its plans for a smarter Alexa to power your home. Now, it’s Google’s turn to promise that it can produce a better, smarter, more helpful Google Assistant. 

Ahead of its fall hardware event next week, Google announced three new Gemini intelligence-powered experiences it plans to bring to its Google Home smart home platform later this year. There’s a new camera intelligence feature that generates descriptive captions for video footage from Nest cameras, a natural language input for creating Google Home routines, and a smarter Google Assistant for Nest smart speakers and displays with an all-new voice. 

Most of these features, aside from the new voice, will be paywalled behind Nest Aware, Google’s video recording subscription for Nest cameras, which starts at $8 a month ($80 a year). The features will launch first in Google’s Public Preview beta program to a limited number of Nest Aware subscribers and will roll out to more users next year.

This is just the start of bringing more intelligence to the company’s smart home platform, Anish Kattukaran, Google Home’s head of product, told The Verge in an interview ahead of the announcements. “This sets the path for the next era of Google Home.”

Google Home’s new smart home hub, the Google TV Streamer 4K, is a Matter controller and Thread border router.
Image: Google Home

All of this will be welcome news for long-suffering Google Home users, many of whom are tired of dealing with underpowered, aging smart displays and seeing features they rely on get canceled. They’ve also been struggling through a laborious transition from the Nest app to the Google Home app.

This week’s launch of the Google TV Streamer 4K (which is a Google Home hub) and a new Nest Learning Thermostat, combined with the promise of a smarter Google Assistant, means things are starting to look good in Google’s hood. 

It also seems the Google Assistant is here to stay. Rather than transplanting Gemini onto Nest speakers and smart displays to control your smart home, Google is deploying Gemini intelligence behind the scenes. “Gemini is a family of models, and we’re optimizing it for elements of Google Home,” explains Kattukaran.

Smarter security camera alerts

The multimodal Gemini AI can understand what a camera sees and hears and produce a caption describing the action.
Image: Google Nest

Google is using Gemini intelligence on Nest cameras to allow them to understand what they see and hear and then tell you what’s most important. This means that instead of just getting an alert for a person or package and then having to watch the video to see what happened, Google Home will add a detailed description of what the camera saw. The models will learn and train on your data — in the cloud, but for your home — getting smarter over time to better understand what’s happening around your home.

One example Kattukaran shared was a clip of a person unloading groceries from a car with the caption:

A young person in casual clothing, standing next to a parked black SUV. They are carrying grocery bags. The car is partially in the garage and the area appears peaceful.

Interpretative details aside, the caption provides a lot of context, which, alongside being helpful, could translate to smarter home automation. For example, if a camera detects an animal and understands that “the dog is digging in the garden,” the next step could be to create an automation to “turn on the sprinklers.” 

You’ll be able to use text prompts to search your Nest cameras’ video footage for specific events.
Image: Google Home

There will also be an option to use text to search through footage in the Google Home activity tab. This could be handy when, say, my cat sneaks out after dark. I could ask Google Home to show me the last time a camera spotted the cat rather than having to scroll through every video tagged with an animal to find him.

Home automation made easier

Gemini intelligence can parse natural language to create complex smart home automations.
Image: Google Home

A new “Help me create” feature in the Google Home app lets you describe what you want to happen, such as “lock the doors and turn off the lights at bedtime,” and the app then creates a routine that does it automatically.

You need to use the text or speech input in the Home app on your phone (it doesn’t work through Nest speakers), but Kattukaran says it will have all the current capabilities of the Google Home app. This includes all the current starters, conditions, and actions, plus access to any device connected to Google Home, including Matter devices. It’s not as complex or sophisticated as Google’s script editor, he says, but it should make creating automations easy for anyone to do. 

Google Assistant grows up and gets new voices

Google is launching a new voice for its Google Assistant.

Besides easier automations and camera intelligence, Google says it’s improving the “core experiences” of its Google Assistant — such as playing music and setting timers — on all current Nest smart speakers and displays.

Plus, Google Assistant is getting new voices with different styles, tones, and accents. The company released a demo of the first new voice engaging in some conversational back and forth. As you can hear in the video, it retains the female tone but sounds lighter and more natural.

Google Assistant should not only sound more natural but should also communicate more naturally. Kattukaran says it won’t need specific nomenclature to do what you want, can handle pauses, ums, and ahs, and can answer follow-up questions. I didn’t see an in-person demo of this, but it sounds similar to the features Amazon announced for Alexa last fall (that have yet to arrive).

Kattukaran says the new Google Assistant will be able to maintain the context of your conversation and start to learn and understand your home. The Gemini-powered capabilities will run “in the cloud, for your home” in accordance with Google’s privacy principles, he says.

“It is specific to your home and your data models. We’re being very intentional about going slow. In the home, the margin for error is very low; we can’t mess up,” he says. The goal is for the models to build an understanding of your home — such as the rooms and devices you have — and then build on that baseline to get smarter over time.

These changes are designed to push the digital voice assistant closer to the vision Google and its competitors have been working toward for years: a digital assistant that can be genuinely helpful. 

“When we started out with that first-gen assistant, the promise was The Jetsons; the vision was an ultra helpful assistant that could proactively help you figure things out,” says Kattukaran. “We made a bunch of progress, then it plateaued — across all the assistants, not just us. We hit a technological ceiling. That’s been raised with LLMs and language models that are more multi-modal.” 

As Kattukaran points out, “The home is a beast.” It’s complicated and messy, with multiple characters and scenarios. It’s hard enough for a human to manage, making it a significant challenge for a computer. But it seems Amazon, Google, and Apple are now all racing toward a future where our homes have an intelligent, context-aware assistant that can help them respond to our needs. It’s going to be fascinating to see how this plays out.
