Stable Audio 2.0
Stability AI
• Framework: PyTorch

Stable Audio 2.0 is an advanced open-source AI model developed by Stability AI for generating music and audio from textual descriptions. Built with PyTorch and licensed under MIT, it offers creators and developers an accessible tool to produce diverse audio content, including music composition and sound design, with high fidelity and creativity.

- Tasks: Text-to-Audio, Music Generation
- Parameter count: N/A
- Dataset: AudioSparx, FreeSound
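Stability AI's actual inference API is not reproduced in this listing. As a hedged illustration of one mechanism text-to-audio diffusion models of this kind commonly rely on, the sketch below shows classifier-free guidance, where a guidance scale blends unconditional and text-conditioned denoiser outputs at each sampling step (function and variable names are hypothetical, not Stable Audio's):

```python
import numpy as np

def cfg_combine(pred_uncond, pred_cond, cfg_scale):
    """Classifier-free guidance: steer the denoising direction toward
    the text-conditioned prediction by a tunable guidance scale."""
    return pred_uncond + cfg_scale * (pred_cond - pred_uncond)

# Toy denoiser outputs for a single sampling step.
uncond = np.array([0.0, 0.0])
cond = np.array([1.0, -1.0])

# cfg_scale > 1 exaggerates the text-conditioned direction.
guided = cfg_combine(uncond, cond, cfg_scale=7.0)
```

A scale of 1.0 reduces to the plain conditioned prediction, while larger values trade diversity for stronger adherence to the text prompt.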
Related AI Models

FastSpeech 2
Microsoft Research Asia
FastSpeech 2 is an improved neural text-to-speech model from Microsoft that generates natural-sounding speech quickly and efficiently. Built with PyTorch and licensed under MIT, it enhances prosody modeling and robustness, making it suitable for real-time voice assistants, audiobooks, and accessibility tools. The open-source code allows developers to customize and deploy the model easily.
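FastSpeech 2's improved prosody modeling hinges on a variance adaptor that predicts per-phoneme durations, pitch, and energy; a length regulator then expands phoneme-level states to frame level. A minimal sketch of that expansion step (simplified to plain lists; the real model operates on tensors of hidden states):

```python
def length_regulate(phoneme_states, durations):
    """Repeat each phoneme's state by its predicted duration so the
    sequence aligns one-to-one with output spectrogram frames."""
    frames = []
    for state, duration in zip(phoneme_states, durations):
        frames.extend([state] * duration)
    return frames

# Three phonemes with predicted durations of 2, 1, and 3 frames.
frames = length_regulate(["HH", "AH", "LOW"], [2, 1, 3])
# → ['HH', 'HH', 'AH', 'LOW', 'LOW', 'LOW']
```

Because durations are predicted up front, all frames can then be generated in parallel, which is what makes the model fast compared to autoregressive synthesizers.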

VITS
Kakao Enterprise
VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is an advanced speech synthesis model introduced by researchers at Kakao Enterprise. It combines variational autoencoders and GANs to generate high-quality, natural-sounding speech directly from text. Built on PyTorch and licensed under MIT, VITS supports fast, end-to-end training and inference, making it popular for voice assistants and media applications.
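The variational-autoencoder half of VITS is trained end-to-end via the standard reparameterization trick: latents are sampled as mu + sigma * eps so that gradients flow through the distribution parameters rather than the random draw. A minimal NumPy sketch of that sampling step (names are illustrative, not taken from the VITS codebase):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I). The randomness
    lives entirely in eps, so mu and log_var remain differentiable
    when this is done with autograd tensors during training."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.array([0.5, -0.5])
log_var = np.array([-2.0, -2.0])
z = reparameterize(mu, log_var, rng)
```

As the predicted variance shrinks toward zero, the sample collapses onto the mean, which is a handy sanity check for the implementation.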

MusicGen
Meta AI
MusicGen is a single-stage autoregressive transformer model from Meta AI, released through the AudioCraft library. Trained to generate high-quality music conditioned on text or a melody (with audio tokenized by EnCodec), it ships in several sizes, including small (300M), medium (1.5B), and large (3.3B) parameters. The code is licensed under MIT and the weights under CC-BY-NC-4.0, enabling controllable, high-fidelity music synthesis across genres.
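The EnCodec tokenizer mentioned above compresses audio into discrete tokens via residual vector quantization: each codebook in turn quantizes the residual left by the previous stage, and the transformer then models those token streams. A toy sketch of the encoding idea (tiny hand-made codebooks; real EnCodec codebooks are learned and far larger):

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Encode vector x as one token index per quantizer stage; each
    stage quantizes whatever residual the stage before it left over."""
    tokens, residual = [], np.asarray(x, dtype=float)
    for codebook in codebooks:
        distances = np.linalg.norm(codebook - residual, axis=1)
        index = int(np.argmin(distances))
        tokens.append(index)
        residual = residual - codebook[index]
    return tokens, residual

codebooks = [
    np.array([[0.0, 0.0], [1.0, 0.0]]),  # coarse first stage
    np.array([[0.0, 0.0], [0.0, 1.0]]),  # refines the leftover residual
]
tokens, residual = rvq_encode([1.0, 1.0], codebooks)
# tokens == [1, 1]; the residual is fully absorbed in this toy case
```

Stacking stages this way lets a small set of codebooks represent audio at high fidelity, which is why a single autoregressive transformer over the resulting tokens suffices for generation.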