Gemini Live Can Now “See” Your Phone (to Some Extent)

Gemini Live, the natural conversation mode in the Google Gemini app, just got a major update: the AI can now answer questions in real time about what it sees through your phone’s camera and on your phone’s screen. The feature is appearing first on Google Pixel 9 and Samsung Galaxy S25 phones.

You’ve been able to give Gemini photos and screenshots to analyze for a long time, but it’s the real-time aspect that makes this update interesting: it’s as if the AI bot could actually see the world around you. You may remember that Google demonstrated some of these features last year under the name Project Astra.

There are many ways to use Gemini Live.

Samsung says it “feels like a trusted friend who’s always there to help,” while Google says you can use the improved features to get personalized shopping advice, troubleshoot problems, or declutter your space. You can talk to Gemini Live about anything you can point your camera at.

The feature is available now as a free update on Pixel 9 and Galaxy S25 phones, and other Android devices will receive it soon, although wider availability will be tied to a Gemini Advanced subscription. There is currently no exact list of phones that will get the update, although it appears to require a certain level of local processing power. There is no word yet on whether it will come to the Gemini iPhone app.

As always, the official advice is to “check answers for accuracy,” so a fancy new interface doesn’t mean Gemini’s AI is any more reliable than it was before. You’ll also need an active internet connection for the feature to work, so the app can get some help from the web.

Two new buttons have been added for camera and screen sharing.

The feature is easy to find: launch the Gemini Live interface by tapping the button on the far right of the input field in any Gemini chat (it looks a bit like a sound wave). From there, you’ll see two new icons at the bottom: one for accessing the camera (a video camera icon) and one for accessing the phone screen (an arrow inside a rectangle).

Close the Gemini Live interface and you’ll find that your conversation has been saved as a standard text chat, so you can return to it when needed. When the new features arrived on my Google Pixel 9, I tested them with questions I already knew the answers to, to check for unhelpful hallucinations.

Putting Gemini Live to the Test

First, I loaded up the camera interface and asked Gemini Live about the Severance episode I was watching on my laptop. The AI initially thought I was watching You (presumably confusing Penn Badgley with Adam Scott), but it quickly corrected its mistake, identifying the right show and naming the actors on screen.

Then I asked about packaging labeled UN3481: lithium-ion batteries packaged inside equipment (in this case, in-ear headphones). Gemini Live correctly identified the lithium-ion batteries as requiring “special care” when handling, but offered no further information. When pressed, it said the code meant the batteries were packaged separately rather than inside the equipment. Wrong answer, Gemini Live: that’s code UN3480.


Gemini Live figured out how to reset the Charge 6 (this is the transcript of the live conversation).

Gemini Live was also able to tell me how to reset my Fitbit Charge 6 when I pointed my phone camera at it (although the AI initially thought it was a Fitbit Charge 5, which is an easy enough mistake to make). It’s easy to see how this could come in handy if you’re troubleshooting gadgets and aren’t quite sure of device brands and model numbers.

Screen sharing with Gemini Live is fun. The app shrinks down to a small widget so you can use your phone as usual, then ask questions about anything on the screen. Gemini Live did a good job of identifying which apps I was using, as well as some of the content within those apps, such as movie posters and band photos. It also accurately translated a foreign-language social media post for me.

With a website showing the recent Leicester v Newcastle football match on screen, Gemini Live correctly told me the score and which players had scored the goals, all information that was already on the screen. However, when I asked when the match was played, the AI got confused and said it was May 22, 2023 (the same teams did play then, but almost two years ago).

Gemini Live can see what’s happening on your phone screen with your permission.

I had no complaints about the speed with which Gemini Live came back with responses, or the calm and reassuring manner in which it delivered them, but there were still issues with the quality of the results. Of course, the appeal lies in being able to point the camera and ask, “How do I fix this?” instead of crafting a complex Google query. That convenience means many people may well choose to use it despite the errors, but the errors remain a concern.

Essentially, it’s a faster, more polished version of visual search: previously, you could just type “label UN3481” into Google for the same query. But while a traditional list of blue links lets you see the information you’re looking for and judge its reliability and authority, Gemini Live is more of a black box that doesn’t show its work. While it seems almost magical at times, having to double-check everything it says isn’t ideal.
