Fraudsters Can Now Use AI to Fake Any Voice
When a powerful new technology like AI is introduced into society without any regulation, some people will inevitably use it to their advantage, so it is no surprise that AI makes fraudsters' work easier and more believable. We previously wrote about the widespread "virtual kidnapping" scam, in which scammers call families and play recordings of children in distress, hoping victims will believe their own child is on the other end of the line. Now, with the help of AI voice generators, scammers can imitate your child's actual voice.
What is an AI voice generator?
Here is how Voicemod, a Spanish sound effects software company, defines AI voice generator technology:
AI voice is a synthetic voice that mimics human speech using artificial intelligence and deep learning. These voices can be used for text-to-speech, for example with Voicemod Text to Song, or speech-to-speech, which is how our AI voice collection works.
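To make that definition concrete, here is a minimal text-to-speech sketch in Python using the open-source Coqui TTS library. It is an illustration only, assuming the library is installed (pip install TTS); the model name and file path are just examples, and this is not necessarily the tooling any particular scammer uses.

# Minimal text-to-speech sketch using the open-source Coqui TTS library.
# Assumes the library is installed: pip install TTS
from TTS.api import TTS

# Load a pre-trained English model (downloaded automatically on first use).
tts = TTS(model_name="tts_models/en/ljspeech/tacotron2-DDC")

# Synthesize speech from the given text and save it as a WAV file.
tts.tts_to_file(
    text="This sentence was never spoken by a human.",
    file_path="synthetic_voice.wav",
)

Speech-to-speech voice conversion works along the same lines, except the input is a recording of a real person rather than text, which is what makes convincing impersonation possible.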
The technology is currently being used to improve what the Federal Trade Commission classifies as "imposter scams," in which scammers pose as a family member or friend to swindle money out of victims, usually older people. In 2022, there were over 5,100 reports of imposter scams carried out over the phone, totaling $11 million in losses, according to The Washington Post.
Services that generate AI voices, such as Voicemod, are readily available to anyone, with very few controls over how they are used. According to Ars Technica, Microsoft's VALL-E text-to-speech AI model can reportedly imitate any voice from just three seconds of audio.
"It can then recreate the pitch, timbre, and individual sounds of a person's voice to create a similar overall effect," Hany Farid, professor of digital forensics at the University of California, Berkeley, told The Washington Post. "This requires a short audio sample, taken from places like YouTube, podcasts, commercials, TikTok, Instagram or Facebook videos."
How does the AI voice generator scam work?
Fraudsters use AI voice generator technology to imitate someone's voice, often that of a small child, in order to trick a relative into believing the child is being held for ransom. The scammer then demands a sum of money in exchange for the child's safe release.
As you can see from this NBC News report, it's easy to take samples of a person's voice from social media and use them to generate whatever speech you want. The result is so realistic that the reporter was able to trick her colleagues into thinking the AI-generated voice was actually her, and one of them agreed to lend her his corporate card to make purchases.
The scam is incredibly convincing and catches people off guard. There are also variants that use the same technology: a Canadian couple lost CA$21,000 (about US$15,449) after a fake phone call from a "lawyer" claiming their son was in jail, according to the same Washington Post report.
What can you do to avoid falling for the AI voice generator scam?
The best line of defense is awareness. If you are aware of the scam and understand how it works, you are more likely to recognize it if it happens to you or one of your loved ones.
If you receive such a phone call, immediately try to call or video chat with the child or "victim" who has allegedly been abducted. As counterintuitive as it may sound, before contacting the authorities or reaching for a credit card to hand over ransom money, contact the supposed victim directly. If this is in fact a scam, doing so will shatter the fraudulent illusion: you will hear or see the "victim" on the other end going about their normal day.
Unfortunately, beyond these precautions, there is little you can do to avoid being targeted. Technology exists that can detect AI-generated content in video, images, audio, and text, but it is not yet practical or widely available to the public in situations like this. The hope is that the same technology that created this problem will eventually provide the solution to it as well.