The realm of music is poised for deeper involvement of generative AI, thanks to a new open-source tool just launched by Meta. Having already demonstrated proficiency in generating human-like conversations and text, the company's AI now extends to generating music and audio, once more drawing on user-provided text prompts.
Credit goes to Meta's open-source AI code, AudioCraft, for this development. The tech titan unveiled the tool on Wednesday, letting users craft lifelike, high-quality audio and music from concise text prompts. AudioCraft comprises three distinct AI models – AudioGen, EnCodec, and MusicGen – which together handle the compression and generation of music and sound.
Meta now joins the ranks of companies merging generative AI with audio. Notably, earlier this year, Google's parent company Alphabet introduced its own experimental audio generation tool, MusicLM.

In its blog post, the social networking corporation disclosed that MusicGen was trained on "20,000 hours of music either owned by Meta or licensed explicitly for this application." An improved version of EnCodec, the neural audio codec, lets users produce sounds with fewer artifacts. Meanwhile, AudioGen, trained on public sound effects, generates audio from user-provided text prompts.
Meta illustrated this with sample audio instances created using AudioCraft, including simulated sounds of whistling, sirens, music, and humming – all prompted by plain text.
"In the recent period, substantial advancements have occurred in generative AI models, particularly within the realm of language models, highlighting extraordinary competencies. These encompass the capability to generate diverse images and videos from textual descriptions, demonstrating spatial understanding, alongside text and speech models that excel in tasks such as machine translation and dialog interactions.

"Despite the considerable enthusiasm surrounding generative AI for images, video, and text, the field of audio has somewhat lagged behind. Although there are existing endeavors, they tend to be intricate and lacking in accessibility, limiting people's ability to readily engage," the corporation stated in its publication.
The incorporation of generative AI into music composition has extensive ramifications for the music industry, creative expression, and society at large, and it opens captivating prospects for creativity. A prominent outcome is that musicians can forge new compositions without performing on an instrument. Through AudioCraft, artists can readily venture into uncharted musical terrain, exploring distinctive sounds, melodies, and harmonies, thereby fostering experimentation and innovation.
Naturally, there are also valid concerns. The swift progress of the AI domain has sparked apprehension among many industry specialists, who rightfully raise alarms about the potential perils of generative AI and the breakneck speed at which the sector is moving. Applying generative AI to music and sound creation raises questions of copyright and ownership: who holds the rights to AI-generated music – the AI developer, the musician using the tool, or the AI model itself? (And the ethical deliberations constitute a separate complex issue entirely.)