Project Astra is a research prototype from Google DeepMind. It gives us a glimpse of how future AI assistants could act like fully-fledged helpers, understanding the things in your surroundings in real time.
The video also showcases how you can annotate the camera view, for example by drawing an arrow toward an object, and ask Gemini about it in real time. It not only identifies the object but also describes what it does.
The demo also showed Gemini explaining what a piece of code does when the camera was pointed at it. It quickly came back with the answer: "This code defines encryption and decryption functions."
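The demo doesn't reveal the exact code on screen, but for a sense of what that answer refers to, here is a hypothetical minimal sketch of the kind of snippet Gemini might have been describing: a pair of symmetric encryption and decryption functions. The XOR-with-repeating-key scheme below is purely illustrative (and not secure), not the code from the video.

```python
def encrypt(data: bytes, key: bytes) -> bytes:
    # XOR each byte of the plaintext with the repeating key
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def decrypt(data: bytes, key: bytes) -> bytes:
    # XOR is its own inverse, so decryption reuses the same operation
    return encrypt(data, key)

ciphertext = encrypt(b"hello", b"secret")
print(decrypt(ciphertext, b"secret"))  # b'hello'
```

Pointing a camera at even a toy snippet like this and getting a one-line summary back is the kind of interaction the demo highlights.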
Similarly, pointing the camera at a neighborhood and asking "What neighborhood do you think I'm in?" resulted in Astra correctly identifying the area, replying "This appears to be the King's Cross area of London" and giving context on what it's famous for. It also seems to remember the position of things it has seen, as it quickly points out the location of the narrator's glasses later in the video.
Beyond that, it can also tackle complex technical problems and propose solutions. It can understand context from drawings and objects framed together, as it did when shown two cat diagrams alongside a box, identifying the reference as Schrödinger's cat.
We did notice some rough edges in the demo, such as noticeable latency, and it's unclear whether you can interrupt Gemini while it's explaining. Astra is a direct competitor to the recently announced GPT-4o.
Other than that, the possibilities seem endless, and we really hope Astra makes it to end users someday. What would be the first thing you'd ask Gemini if you got access to Project Astra? Let us know in the comments below.