OpenAI has introduced a new tool that can replicate an individual’s voice using just a 15-second audio snippet.
Named Voice Engine, the technology uses that 15-second recording to learn the nuances of the person’s voice and speech patterns. Users can then input text to generate natural-sounding, emotive speech that says whatever they choose. OpenAI disclosed that Voice Engine was used with preset voices in 2022 and has since been enhanced, and that this is the first time it has been applied to real people’s speech. The company also acknowledged the technology’s potential for misuse in a blog post on Friday (March 29).
“Given the potential misuse of synthetic speech, we are proceeding cautiously and thoughtfully towards a wider deployment,” stated OpenAI in a website announcement. “We aim to initiate a dialogue on societal responses to these emerging capabilities and the responsible utilization of synthetic voices.”
OpenAI emphasized that the decision to release Voice Engine to the public at large would be contingent upon the outcomes of these deliberations.
“We will make an informed determination regarding the potential release and implementation of this technology on a larger scale,” the company affirmed.
The implications of Voice Engine are profound. While it holds promise for legitimate uses such as rapid transcription of lectures or improving communication, there is a clear risk that it could be used to manipulate speech with malicious intent. OpenAI noted that various frauds already exist that aim to trick people into divulging sensitive information or transferring money to scammers.
In a demonstration of Voice Engine, OpenAI showcased the ability to generate lifelike speech resembling a specific speaker using only a brief audio snippet. https://t.co/yLsfGaVtrZ (March 29, 2024)
OpenAI underscored the significance of receiving feedback due to the inherent risks associated with this technology. The company revealed ongoing engagements with governmental bodies, marketing firms, entertainment enterprises, and academic establishments in the United States and internationally to explore the potential of Voice Engine. Participants in these endeavors have committed to refraining from impersonating others during the testing of Voice Engine and have pledged to clearly indicate to listeners that the speech is AI-generated. OpenAI has also implemented watermarking to distinguish AI-generated voices.
OpenAI suggested that broader adoption of artificial voice technology should be accompanied by two safeguards: voice authentication that confirms the original speaker has knowingly consented to the use of their voice, and a blacklist that prevents the creation of voices resembling well-known personalities.
The future of Voice Engine remains uncertain. While it may eventually be released to the public, OpenAI reserves the right to withhold it if it judges a wider release to be unwise. Regardless, the company emphasized that the technology will keep advancing and that the world needs to understand where such innovations are heading, whether or not they are ultimately widely deployed.