According to a person familiar with the matter, ElevenLabs has suspended the creator of an audio deepfake of US President Joe Biden urging voters not to cast ballots in this week’s New Hampshire primary.
The deepfake audio was created using ElevenLabs’ technology, according to Pindrop Security Inc., a voice-fraud detection company.
ElevenLabs learned of Pindrop’s findings this year and is investigating, according to the person. The user’s account was suspended after the deepfake was traced to its creator, said the person, who asked not to be identified because the information isn’t public.
ElevenLabs, which uses artificial intelligence software to replicate voices in more than 20 languages, said in a statement that it could not comment on specific incidents. It added: “We are committed to preventing audio AI tool abuse and take any such occurrences seriously.”
ElevenLabs announced an $80 million funding round earlier this week from investors including Sequoia Capital and Andreessen Horowitz. Chief Executive Officer Mati Staniszewski said the latest financing values his company at $1.1 billion.
Staniszewski said in an interview last week that any audio that impersonates voices without permission will be taken down. According to the company’s website, voice clones of public figures, such as politicians, are permitted if the clips “express laughter or ridicule in a way that the listener is clear that what they are hearing is parody.”
The fake Biden robocall, which urged voters to save their votes for the US elections in November, has alarmed election officials and disinformation experts alike. It not only showed how easy audio deepfakes are to make, but also pointed to the possibility that bad actors might use the technology to keep voters away from the polls.
A spokeswoman for the New Hampshire Attorney General said at the time that the message appeared “to be an unconstitutional attempt to disrupt the New Hampshire Presidential Primary Election and to suppress New Hampshire voters.” The office has opened an investigation.
ElevenLabs users who want to clone voices pay for the feature with a credit card. It’s unclear whether ElevenLabs has passed this information to New Hampshire authorities.
On January 22, Bloomberg News obtained a copy of the recording from the Attorney General’s office and tried to determine what technology was used to create it. Those efforts included running it through ElevenLabs’ own “speech classifier” tool, which is designed to show whether audio was produced using artificial intelligence and the company’s technology. According to the tool, the recording had a 2% likelihood of being synthetic or created using ElevenLabs.
Other deepfake-detection tools confirmed that the recording was a deepfake, but they could not identify the technology behind the audio.
According to Vijay Balasubramaniyan, the founder of Pindrop, his team cleaned the audio by removing background noise and silence, then broke it into 155 segments of 250 milliseconds each for in-depth analysis. The company then compared the audio to a database of other samples it had collected from more than 100 text-to-speech systems, which are frequently used to create deepfakes, he said.
According to Balasubramaniyan, the analysts came to the conclusion that it was almost certainly made using ElevenLabs’ technology.
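The segmentation step described above can be illustrated with a short sketch. This is not Pindrop's pipeline, which the article does not detail: the function names, the 0.01 RMS silence threshold and the synthetic test signal are all assumptions made for illustration, and noise removal is not modeled. The sketch cuts audio into fixed 250-millisecond frames and drops frames that are effectively silent. For scale, 155 segments of 250 milliseconds amount to 38.75 seconds of audio.

```python
import math


def frame_length(sample_rate, frame_ms=250):
    """Number of samples in one frame of frame_ms milliseconds."""
    return sample_rate * frame_ms // 1000


def rms(frame):
    """Root-mean-square amplitude of a frame (a simple loudness measure)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def split_into_segments(samples, sample_rate, frame_ms=250, silence_rms=0.01):
    """Cut samples into fixed-length frames and drop near-silent ones.

    silence_rms is a hypothetical threshold chosen for this example only.
    A trailing partial frame is discarded.
    """
    n = frame_length(sample_rate, frame_ms)
    segments = []
    for start in range(0, len(samples) - n + 1, n):
        frame = samples[start:start + n]
        if rms(frame) >= silence_rms:
            segments.append(frame)
    return segments


# Synthetic stand-in for a recording: 1 s of tone, 0.5 s of silence, 1 s of tone.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]
audio = tone + [0.0] * (rate // 2) + tone

segments = split_into_segments(audio, rate)
total_seconds = 155 * 250 / 1000  # duration implied by the article's figures
print(len(segments), "segments of", len(segments[0]), "samples each")
```

Each retained segment could then be scored against reference samples from known text-to-speech systems. How that comparison is performed is not described in the article, so it is left out here.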
Balasubramaniyan echoed a point made by a moderator in a public forum on ElevenLabs’ Discord support channel: the company’s speech classifier cannot detect its own audio unless it is analyzing the raw file. With the Biden call, he said, the only files available for immediate analysis were recordings of the phone call, which made the analysis harder because some metadata had been removed and it was more difficult to detect wavelengths.
Siwei Lyu, a professor at the University at Buffalo who specializes in deepfakes and digital media forensics, also analyzed a copy of the deepfake and ran it through ElevenLabs’ classifier. He concluded that it was most likely created using that company’s software, he told Bloomberg News. Because the software is so widely used, Lyu said, ElevenLabs’ classifier is one of the first things he checks when trying to determine the origins of an audio deepfake.
With the general election approaching, he said, there will be a lot more of this, and it is without a doubt an issue everyone needs to be aware of.
Pindrop shared with Bloomberg News a version of the audio that its researchers had cleaned and refined. Using that file, ElevenLabs’ speech classifier determined that the recording was an 84% match to its own technology.
Balasubramaniyan described voice-cloning technology as “a troubling thing” because it allows for a “ridiculous combination of scale and personalization” that can deceive listeners into believing they are hearing local politicians or high-ranking elected officials.
Tech investors are pouring money into AI startups that create synthetic voices, videos and images, in the hope that they will transform the media and gaming industries.
In last week’s interview, Staniszewski said that five people at his 40-person company were responsible for content moderation. The CEO said that “99 percent of use cases we are seeing are in a good realm.” With its funding announcement, the company also disclosed that its platform had generated more than 100 years of audio in the previous 12 months.