Google Gemini 1.5 Pro Just Got Ears

The latest version of Google’s artificial intelligence, Gemini 1.5 Pro, can now hear you.

Gemini is Google’s updated chatbot, formerly called Bard, and Gemini 1.5 Pro is the latest version of the model, first made available to a limited number of developers in February this year. Gemini 1.5 Pro can process text, code, video and (now) audio files, including audio from videos, which it can listen to, analyze and extract information from without a corresponding written transcript.

In practice, support for audio files means users can use Gemini 1.5 Pro to pull information from recorded calls, transcribe recorded interviews, or analyze videos with audio, essentially any type of audio file. The AI can process queries that include one hour of video, 11 hours of audio, 30,000 lines of code, or more than 700,000 words in a single request.
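To make those limits concrete, here is a minimal Python sketch that checks whether a single input fits within the figures quoted above. It is not part of any Google SDK: the `LIMITS` table and the `fits_in_one_request` helper are hypothetical names invented for illustration, and the numbers are simply the ones reported in this article, treated as alternatives rather than a combined budget.

```python
# Per-request limits as quoted in the article (assumed, not read from any API).
# The article says "more than 700,000 words", so these are conservative bounds.
LIMITS = {
    "video_hours": 1.0,
    "audio_hours": 11.0,
    "code_lines": 30_000,
    "words": 700_000,
}


def fits_in_one_request(kind: str, amount: float) -> bool:
    """Return True if `amount` of input type `kind` is within the quoted limit."""
    if kind not in LIMITS:
        raise ValueError(f"unknown input kind: {kind!r}")
    return 0 <= amount <= LIMITS[kind]


if __name__ == "__main__":
    # A 90-minute recorded interview, expressed in hours of audio.
    print(fits_in_one_request("audio_hours", 1.5))
    # A two-hour video exceeds the quoted one-hour figure.
    print(fits_in_one_request("video_hours", 2.0))
```

By this rough check, a 90-minute interview recording sits comfortably inside the quoted audio limit, while a two-hour video would need to be split before being sent.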

Google is also making Gemini 1.5 Pro available in public preview to those with access to Vertex AI, but there’s no general public beta in sight yet. Currently, most users interact with Google’s artificial intelligence through the Gemini chatbot.
