Stability AI Launches Stable Audio 2.0 With Audio-to-Audio Generation Feature
Written by Kristin Robinson on April 3, 2024
Stability AI has launched Stable Audio 2.0, adding key new functions to the company’s text-to-music generator. Users can now generate tracks up to three minutes long in 44.1 kHz stereo from a natural language prompt such as “A beautiful piano arpeggio grows to a full beautiful orchestral piece” or “Lo-fi funk.”
Stable Audio 2.0 also adds audio-to-audio generation, allowing users to manipulate any audio sample they want using text-based AI prompts. Its terms of service, however, require that any audio uploaded to the tool be free of copyrighted material, and the tool employs a content recognition filter to ensure compliance.
Stable Audio 2.0 is the company’s first major music product to launch since its vp of audio, Ed Newton-Rex, resigned. To announce his exit, Newton-Rex wrote a lengthy post on social media saying, “I don’t agree with the company’s opinion that training generative AI models on copyrighted works is ‘fair use’… I hope others will speak up, either internally or in public, so that companies realise that exploiting creators can’t be the long-term solution in generative AI.”
Unlike some of the company’s other models, Stable Audio and Stable Audio 2.0 are trained only on data the company licensed from the music library AudioSparx. The library holds more than 800,000 audio files, including music, sound effects and single-instrument stems, along with text metadata. All of the musicians who created the works in the AudioSparx library were given the option to opt out of having their work used to train Stable Audio’s model.
Stable Audio 2.0 is now available to use for free on the Stable Audio website and will soon be available on the Stable Audio API.