
### Unveiling Meta’s AI-Powered Audiobox: A Smart Sound Choice

The new tool aims to let users create diverse audio content using advanced AI, surpassing its contr…

On Monday, Meta, the parent company of Facebook, introduced the latest iteration of Audiobox, an AI-driven audio generation platform. It lets users create distinct voices and sound effects from typed descriptions and prompts.

Audiobox, an evolution of Meta’s Voicebox program launched earlier this year, offers higher audio quality and incorporates automatic audio watermarking to support responsible use. The platform combines speech generation and editing features, and can produce sound effects (e.g., dog barks, car horns, thunder cracks) and soundscapes from voice input and text prompts.

Meta’s Audiobox team highlighted its use of bespoke solvers that significantly speed up generation. The team claims a speedup of more than 25 times over the default solver used for this class of model, without compromising performance.

Voicebox, Meta’s earlier AI tool, was unveiled in June. It can generate speech in multiple languages, including English, French, German, Spanish, Polish, and Portuguese, and aims to mimic natural human speech patterns more accurately.

Due to concerns surrounding AI-generated deepfakes, Meta initially decided not to release Voicebox to the public. With Audiobox, Meta introduced audio watermarking to deter potential misuse.

Recent advancements in audio quality and fidelity have expanded the capabilities of the Audiobox model, enabling new applications and use cases. However, the Audiobox team acknowledges growing concerns about potential misuse of the technology. To address this, the platform automatically watermarks generated audio so that it can be traced back to its source.

Meta’s watermarking technique embeds imperceptible messages in the audio, detectable by AI models at the frame level, enhancing traceability and control over the generated content.
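To make the idea of a frame-level watermark concrete, here is a toy sketch in Python. It is not Meta's technique, which this article does not describe in detail. It is a simple keyed spread-spectrum scheme, and the names `embed` and `detect_frames`, the frame size, the strength, and the threshold are all assumptions chosen for illustration.

```python
import math
import random

FRAME = 1000  # samples per frame (hypothetical value)


def keyed_pattern(key, n=FRAME):
    """Pseudo-random +/-1 sequence derived from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed(samples, key, strength=0.03):
    """Add a low-amplitude copy of the keyed pattern to every frame.

    The strength here is deliberately large so the toy detector is reliable;
    it would be audible. A real system would shape the mark perceptually.
    """
    pattern = keyed_pattern(key)
    return [s + strength * pattern[i % FRAME] for i, s in enumerate(samples)]


def detect_frames(samples, key, threshold=0.015):
    """Return one True/False flag per full frame: does it carry the mark?"""
    pattern = keyed_pattern(key)
    flags = []
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        frame = samples[start:start + FRAME]
        # Correlate the frame with the keyed pattern. A marked frame scores
        # near `strength`; an unmarked frame scores near zero.
        score = sum(x * p for x, p in zip(frame, pattern)) / FRAME
        flags.append(score > threshold)
    return flags


# Usage: a 440 Hz tone in which only the second half is watermarked.
SAMPLE_RATE = 16000
tone = [0.1 * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
        for t in range(4 * FRAME)]
mixed = tone[:2 * FRAME] + embed(tone[2 * FRAME:], key=42)
print(detect_frames(mixed, key=42))
```

Because detection runs frame by frame, a detector of this kind can indicate which portions of a clip carry the mark, not just whether the clip as a whole does.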

The Audiobox team emphasizes the importance of accurate data labeling in training audio-generative AI models. They recommend detailed labeling of sounds, such as specific dog breeds, rather than generic terms like “dog barking,” to enhance model accuracy. Similarly, speech patterns and regional dialects should be precisely identified for optimal performance.
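As a small illustration of that labeling advice, the sketch below contrasts a generic label with a detailed one. The `SoundLabel` record and its field names are hypothetical and are not taken from Meta's data pipeline.

```python
from dataclasses import dataclass


@dataclass
class SoundLabel:
    """Hypothetical annotation record for one audio clip."""
    category: str        # broad class, e.g. "dog"
    specific: str = ""   # finer detail, e.g. a breed or a regional dialect
    action: str = ""     # what is happening, e.g. "barking"
    context: str = ""    # setting, e.g. "outdoors"

    def caption(self) -> str:
        # Prefer the specific term over the broad category when available.
        subject = self.specific or self.category
        return " ".join(p for p in (subject, self.action, self.context) if p)


generic = SoundLabel(category="dog", action="barking")
detailed = SoundLabel(category="dog", specific="golden retriever",
                      action="barking", context="outdoors")
print(generic.caption())
print(detailed.caption())
```

The detailed record carries the extra descriptive signal the team recommends, while keeping the broad category available for coarse filtering.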

Meta’s commitment to ethical AI development is evident through its investments in artificial intelligence and collaborations with industry leaders like IBM to establish the AI Alliance. This alliance aims to foster open-source AI innovation and knowledge sharing among various stakeholders to advance societal benefits through cutting-edge AI technologies.

Last modified: February 6, 2024