Google’s Gemini is showing an early ability to identify sounds. Although the capability is still at an early stage, the AI correctly recognized common audio cues such as clanging metal and doors opening and closing, and it distinguished a vacuum cleaner from a siren. In one notable error, Gemini mistook a sliding door for the sound of water. According to the report, the capability sets Gemini apart from models such as ChatGPT, which lacks this audio recognition feature. The initial report and the discussion that followed can be found on Reddit: https://old.reddit.com/r/artificial/comments/1kkjwie/gemini_can_identify_sounds_this_skill_is_new_to_me/
Google’s Gemini Learns to ‘Hear’: AI Shows Promise in Sound Recognition
