Security experts are warning of surging threat actor interest in voice cloning-as-a-service (VCaaS) offerings on the dark web, designed to streamline deepfake-based fraud.
Recorded Future’s latest report, I Have No Mouth and I Must Do Crime, is based on threat intelligence analysis of chatter on the cybercrime underground.
Deepfake audio technology can mimic the voice of a target to bypass multi-factor authentication, spread mis- and disinformation and enhance the effectiveness of social engineering in business email compromise (BEC)-style attacks, among other things.
Recorded Future warned that out-of-the-box voice cloning platforms are increasingly available on the dark web, lowering the barrier to entry for cyber-criminals. Some are free to use with a registered account while others cost little more than $5 per month, the vendor claimed.
Among the chatter observed by Recorded Future, impersonation, call-back scams and voice phishing are frequently mentioned in the context of such tools.
In some cases, cyber-criminals are abusing legitimate tools such as those intended for use in audio book voiceovers, film and television dubbing, voice acting and advertising.
One apparently popular option is ElevenLabs’ Prime Voice AI software, a browser-based text-to-speech tool that allows users to upload custom voice samples for a premium charge.
However, in restricting the use of the tool to paid customers, the vendor has encouraged more dark web innovation, according to the report.
“It has led to an increase in references to threat actors selling paid accounts to ElevenLabs – as well as advertising VCaaS offerings. These new restrictions have opened the door for a new form of commodified cybercrime that needs to be addressed in a multi-layered way,” the report continued.
Fortunately, many current deepfake voice technologies are limited to generating one-time samples that cannot be used in extended real-time conversations. However, an industry-wide approach is needed to tackle the threat before it escalates, Recorded Future argued.
“Risk mitigation strategies need to be multidisciplinary, addressing the root causes of social engineering, phishing and vishing, disinformation, and more. Voice cloning technology is still leveraged by humans with specific intentions – it does not conduct attacks on its own,” the report concluded.
“Therefore, adopting a framework that educates employees, users, and customers about the threats it poses will be more effective in the short-term than fighting abuse of the technology itself – which should be a long-term strategic goal.”
Infosecurity approached Recorded Future for further comment, but it was unwilling to provide anything beyond the report.