- Hackers are using AI to clone voices in a new wave of deepfake technology
- I let a hacker clone my voice, and the results were alarming
By Shivali Best and Jordan Saward for MailOnline
Like our fingerprints, our voices are unique to us. So imagine how unsettling it would be if someone cloned yours.
That is now possible thanks to voice cloning, a new form of deepfake in which hackers use artificial intelligence (AI) to mimic and replicate a person's voice.
Celebrities and public figures including Stephen Fry, Sadiq Khan, and Joe Biden have already fallen victim to voice cloning, while one unnamed CEO was even scammed into transferring $243,000 after receiving a fraudulent phone call.
But how does the process work, and how convincing is it?
To find out, I let a professional hacker clone my voice, with chilling results.
Voice cloning is an AI technique in which a hacker takes an audio sample of someone speaking, trains an AI model on that voice, and uses the model to reproduce it.
Dane Sherrets, a Solutions Architect at HackerOne, told MailOnline: “Voice cloning was originally used for audiobooks and to help people with speech impairments, but it has since been embraced by Hollywood, and unfortunately by fraudsters too.”
When voice cloning first emerged in the late 1990s, it required genuine AI expertise. But it has since become so cheap and accessible that almost anyone can do it, according to Mr. Sherrets.
“Even people with minimal expertise can clone a voice,” he said. “With freely available open-source tools, the process can be completed in under five minutes.”
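To give a sense of how low the barrier is, here is a minimal sketch of what such a pipeline can look like using Coqui TTS, one freely available open-source library that can clone a voice from a short reference recording. This is purely illustrative; it is not the tool Mr. Sherrets used, and the file names and text are placeholders.

```python
# Illustrative only: a minimal voice-cloning sketch using the open-source
# Coqui TTS library (pip install TTS). Not the tool from the demonstration;
# file names below are placeholders.
from TTS.api import TTS

# Load XTTS v2, a pretrained multilingual model that can mimic a voice
# from a short reference recording, with no custom training run needed.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

# Generate speech in the voice captured in reference.wav.
tts.tts_to_file(
    text="This is a demonstration of voice cloning.",
    speaker_wav="reference.wav",  # a short audio sample of the target voice
    language="en",
    file_path="cloned.wav",
)
```

As the sketch suggests, modern tools need only a reference clip and some text, which is why a few minutes of captured audio is enough.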
How to protect yourself from voice cloning scams
- Listen for anomalies in the audio, such as odd pauses, unnatural phrasing, or background noise.
- Ask questions only the real person could answer.
- Agree a unique code word with friends and family to verify who you are speaking to.
To clone my voice, Mr. Sherrets needed just a five-minute audio clip. I chose to read a Daily Mail article aloud, but hackers could just as easily capture audio from a phone call or a social media video.
“It can be done with live calls, social media content, or podcasts, essentially anything you record or upload day to day,” Mr. Sherrets said.
Once he had the clip, he fed it into a tool (which he declined to name) to train it on my voice. He could then type text, or speak, directly into the tool to generate messages that sounded like me.
“With the tools available today, I can add nuance, pauses, and other elements to make it sound more authentic, which makes it more convincing in a scam,” he said.
The first clone of my voice had no pauses or inflections added, yet it was strikingly realistic. It captured my American-Scottish accent perfectly as it said: “Hey Mum, it’s Shivali. I’ve lost my bank card and need to transfer some money. Can you please send some to the account that just texted you?”
The second clip, with pauses added, was even eerier. “The longer pause and the breath towards the end make the voice sound more natural,” Mr. Sherrets explained.
Thankfully, in my case the voice clone was just a demonstration. But Mr. Sherrets warned that the technology carries serious risks.
“Some people have received fake kidnapping calls, where a distressed ‘child’ demands a ransom. And targeted social engineering attacks on companies are on the rise,” he cautioned.
“I cloned my own CEO’s voice using the same technology. Because CEOs have such a public presence, getting high-quality samples of their voice is easy,” he added.
“With access to a CEO’s voice, it becomes much easier to get past security measures or obtain passwords. Organizations need to wake up to this threat.”
Fortunately, there are telltale signs that can give a voice clone away, according to Mr. Sherrets.
“Things to look out for include unnatural pauses, odd phrasing, or background artifacts. For example, a clone built from audio recorded in a noisy environment may carry that interference into the fake,” he explained.
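As a very rough illustration of the kind of automated check this points towards, the sketch below uses the open-source librosa library to flag unusually long pauses in a recording. It is a toy heuristic under simple assumptions (the file name and the 1.5-second threshold are placeholders), not a real deepfake detector.

```python
# Illustrative only: flag suspiciously long pauses in a recording using
# librosa (pip install librosa). A toy heuristic, not a deepfake detector.
import librosa

# Load the recording; sr=None preserves the original sample rate.
y, sr = librosa.load("suspicious_call.wav", sr=None)

# Split the signal into non-silent intervals; anything more than 30 dB
# below the peak is treated as silence.
intervals = librosa.effects.split(y, top_db=30)

# Report gaps between speech segments that exceed an arbitrary threshold.
for prev, nxt in zip(intervals[:-1], intervals[1:]):
    gap = (nxt[0] - prev[1]) / sr  # gap length in seconds
    if gap > 1.5:
        print(f"Long pause of {gap:.2f}s at {prev[1] / sr:.2f}s")
```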
As the technology improves, however, these anomalies will become harder to spot.
“Staying vigilant is key. Any urgent request should raise suspicion and prompt you to verify the caller. It’s also worth agreeing a unique code word with friends and family,” he advised.
He also recommends keeping tabs on your digital footprint and being careful about what you share online.
“Everything you upload increases the risk of your audio being exploited. People need to be aware of how their voice could be misused,” he said.