SoundHound AI is adding visual capabilities to its voice assistant technology with a new system called Vision AI. The system combines audio and visual input to create a more natural and intuitive user experience, mirroring the way humans interact with the world.
Vision AI analyzes live camera feeds in conjunction with voice commands, allowing devices to better understand user intent within their environment. Potential applications include giving mechanics real-time visual and audio guidance as they examine equipment through smart glasses, and enabling retail staff to manage inventory by visually scanning shelves. The development builds on Amelia 7.1, SoundHound AI’s recent update to its AI agent, which focused on increased speed and accuracy.
Keyvan Mohajer, CEO of SoundHound AI, highlights the company’s focus on creating integrated and responsive AI solutions designed for real-world use. By making the relationship between humans and technology more intuitive and collaborative, SoundHound AI aims to deliver faster service, reduce errors, and improve overall customer experiences.