Google’s Project Astra Is an AI Assistant That Can Respond to What It Sees
At I/O 2024, Google made a lot of interesting announcements in the field of artificial intelligence, but everyone is talking about Project Astra. Essentially, Project Astra is what Google calls an “advanced see-and-talk responsive agent.” In practice, this means Google’s future AI will be able to take in information about what’s going on around you through your device’s camera, so you can ask questions about what it sees and get answers in real time. It’s almost like a much more capable version of Google Lens.
Project Astra is being developed by Google’s DeepMind team, whose stated goal is to create artificial intelligence that responsibly benefits humanity; this project is one step toward that. Google says Project Astra is built on top of Gemini 1.5 Pro, which has received improvements in areas such as translation, coding, and reasoning. As part of the project, Google says it has developed prototype AI agents that can process information faster by continuously encoding video frames and combining video and speech input into a single timeline of events. The company is also using its speech models to make its AI agents sound more natural and give them a wider range of intonation.
Google has released a two-part demo video showing how Project Astra works. The first half of the video shows Project Astra running on a Google Pixel phone; the second half shows the same AI running on a prototype smart-glasses device.
In the demo video, a user opens the camera viewfinder on their Pixel phone and pans the device around the room while asking the next-generation Gemini AI assistant, “Tell me when you see something that makes a sound.” The AI responds by pointing out the speaker on the table. Other examples in the video include asking what a piece of code on a computer screen does, asking what part of town they are currently in, and coming up with a band name for a dog and her toy tiger.
While it will likely be a long time before the next generation of AI from Project Astra makes its way into our everyday lives, it’s still pretty cool to see what the future may hold.