In a critical step toward combating audio deepfakes, researchers have unveiled a “machine unlearning” technique enabling AI text-to-speech models to selectively forget specific voices. This breakthrough addresses the growing threat of audio deepfakes used in fraud and scams by allowing individuals to opt out of unauthorized voice replication.
The core of the innovation lies in teaching AI models to effectively redact their ability to mimic specific voices. Unlike traditional “guardrail” approaches, which act as preventative barriers around unwanted data, machine unlearning aims to create a new model that essentially never learned the targeted voice.
“Guardrails are like fences, whereas machine unlearning removes the problematic data altogether,” explains Jinju Kim, a master’s student at Sungkyunkwan University. The team, led by Professor Jong Hwan Ko, demonstrated the method on a recreation of Meta’s VoiceBox. In results presented at the International Conference on Machine Learning, a model prompted with an “unlearned” voice responded with a random voice instead of mimicking it, yielding a significant reduction in voice similarity.
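The core behavior, responding with a random voice rather than the unlearned one, can be illustrated with a toy sketch. Everything here is hypothetical (the names `ToyVoiceModel`, `enroll`, and `unlearn` are invented, and real unlearning fine-tunes a VoiceBox-style generative model rather than swapping embeddings); it only shows the intended input-output contract, not the team's actual training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)


class ToyVoiceModel:
    """Hypothetical stand-in for a speaker-conditioned TTS model."""

    def __init__(self, dim: int = 256):
        self.dim = dim
        self.speakers: dict[str, np.ndarray] = {}  # name -> voice embedding

    def enroll(self, name: str) -> None:
        """Learn a speaker's voice (stands in for training on their audio)."""
        self.speakers[name] = rng.standard_normal(self.dim)

    def mimic(self, name: str) -> np.ndarray:
        """Return the voice embedding the model conditions on for this speaker."""
        return self.speakers[name]

    def unlearn(self, name: str) -> None:
        # The article's key idea: rather than refusing the request (a
        # guardrail), make the model behave as if it never learned the
        # voice -- here, by mapping the prompt to a random voice.
        self.speakers[name] = rng.standard_normal(self.dim)


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity, a common proxy for voice similarity."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


model = ToyVoiceModel()
model.enroll("alice")
original = model.mimic("alice").copy()
print(cosine(original, model.mimic("alice")))  # 1.0: perfect mimicry

model.unlearn("alice")
print(cosine(original, model.mimic("alice")))  # near 0: a random, unrelated voice
```

In a real system the similarity drop would be measured with a speaker-verification model on generated audio, and the "random voice" would be produced by the retrained generator itself.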
The process currently requires approximately five minutes of audio per voice and several days to complete, illustrating the trade-off between efficiency and forgetting. While the model’s ability to mimic permitted voices decreases slightly (by about 2.8 percent), the improvement in redacting unwanted voices is substantial. Vaidehi Patil, a PhD student at the University of North Carolina at Chapel Hill, notes that such trade-offs are inherent to unlearning.
Despite being in its early stages, the technology is generating significant industry interest. As the potential for real-world deployment grows, the need for faster, more scalable solutions becomes increasingly urgent to ensure individuals can safeguard their voices in an era of rapidly advancing AI capabilities.