open source

MusicGen

Provided by: Meta AI
Framework: PyTorch

MusicGen is a single-stage autoregressive transformer from Meta AI, distributed through the AudioCraft library. It generates music conditioned on a text description or a reference melody, and it models audio as discrete tokens produced by the EnCodec tokenizer. It is available in several sizes, including small (300M), medium (1.5B), and large (3.3B). The code is licensed under MIT and the model weights under CC-BY-NC-4.0. The model supports controllable music synthesis across genres.
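To make the "single-stage autoregressive" description concrete, here is a rough sketch of the token arithmetic for one clip. The frame rate (50 Hz), the codebook count (4), and the one-step-per-codebook delay pattern are assumptions taken from the public MusicGen model card and paper, not from this page. The function names are hypothetical, and the real decoder may add a step or two for special tokens.

```python
# Assumed values from the public MusicGen model card (not stated on this page).
FRAME_RATE_HZ = 50   # EnCodec frames per second of audio
NUM_CODEBOOKS = 4    # parallel token streams per frame


def decoding_steps(duration_s: float, delayed: bool = True) -> int:
    """Approximate number of autoregressive steps for a clip.

    delayed=True  : codebook k is shifted by k steps, so all codebooks are
                    predicted in parallel at each step.
    delayed=False : every token is predicted one at a time (flattened).
    """
    frames = round(duration_s * FRAME_RATE_HZ)
    if delayed:
        return frames + NUM_CODEBOOKS - 1
    return frames * NUM_CODEBOOKS


def delay_pattern(frames: int, codebooks: int = NUM_CODEBOOKS, pad: int = -1):
    """grid[k][s] is the frame index codebook k emits at step s, or pad."""
    steps = frames + codebooks - 1
    return [
        [s - k if 0 <= s - k < frames else pad for s in range(steps)]
        for k in range(codebooks)
    ]


if __name__ == "__main__":
    print(decoding_steps(30))                 # delayed pattern
    print(decoding_steps(30, delayed=False))  # flattened pattern
    for row in delay_pattern(3, codebooks=2):
        print(row)
```

Under these assumptions, a 30-second clip needs about 1,503 decoding steps with the delay pattern, compared with 6,000 if every token were generated sequentially.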

Model Performance Statistics

Views: 13
Released: June 12, 2023
Last Checked: Jul 20, 2025
Version: v2

Capabilities
  • Text-to-Music
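A minimal text-to-music sketch using AudioCraft might look like the following. It assumes the `audiocraft` package is installed and that the `facebook/musicgen-small` checkpoint can be downloaded. The helper names, the output stem, and the 1-second lower bound are hypothetical choices for illustration; the 30-second cap mirrors the clip length listed on this page.

```python
MAX_SECONDS = 30.0  # clip length listed on this page
MIN_SECONDS = 1.0   # hypothetical lower bound, chosen for this sketch


def clamp_duration(seconds: float) -> float:
    """Keep the requested duration inside [MIN_SECONDS, MAX_SECONDS]."""
    return max(MIN_SECONDS, min(float(seconds), MAX_SECONDS))


def generate_clip(prompt: str, seconds: float = 8.0,
                  out_stem: str = "musicgen_demo",
                  checkpoint: str = "facebook/musicgen-small"):
    """Generate one clip from a text prompt and write it to disk."""
    # Imported lazily so clamp_duration works without AudioCraft installed.
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    model = MusicGen.get_pretrained(checkpoint)
    model.set_generation_params(duration=clamp_duration(seconds))
    wav = model.generate([prompt])  # tensor shaped [batch, channels, samples]
    # audio_write appends the file extension and normalizes loudness.
    return audio_write(out_stem, wav[0].cpu(), model.sample_rate,
                       strategy="loudness")


if __name__ == "__main__":
    print(generate_clip("lo-fi hip hop beat with warm piano chords"))
```

The first call downloads the checkpoint, and generation is much faster on a GPU. Note that the weights are licensed CC-BY-NC-4.0, so generated audio is subject to non-commercial terms.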
Performance Benchmarks
FAD: 2.1
Length: 30 seconds
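The page does not define FAD. Assuming it refers to Fréchet Audio Distance, as is usual in music-generation benchmarks, the score compares two Gaussians fitted to audio embeddings: one with mean and covariance (\mu_r, \Sigma_r) from reference recordings and one with (\mu_g, \Sigma_g) from generated clips. Lower values are better. The embedding model and evaluation set behind the figure above are not stated here.

```latex
\mathrm{FAD} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```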
Technical Specifications
Parameter Count: 300M (small), 1.5B (medium), 3.3B (large)
Training & Dataset

Dataset Used: 20K hours of licensed music

Related AI Models


Model, open source

FastSpeech 2

Microsoft Research Asia

FastSpeech 2 is a non-autoregressive neural text-to-speech model from Microsoft that improves on FastSpeech and generates natural-sounding speech quickly. It is built with PyTorch and licensed under MIT. It improves prosody modeling and robustness, which makes it suitable for real-time voice assistants, audiobooks, and accessibility tools. The open-source code lets developers customize and deploy the model.

Speech & Audio, audio, text-to-speech
14
Model, open source

VITS

Kakao Enterprise

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a speech synthesis model developed by Kakao Enterprise. It combines a conditional variational autoencoder with adversarial (GAN-style) training to generate natural-sounding speech directly from text. Built on PyTorch and licensed under MIT, VITS supports end-to-end training and fast inference, which has made it popular for voice assistants and media applications.

Speech & Audio, audio, text-to-speech
14
Model, open source

Stable Audio 2.0

Stability AI

Stable Audio 2.0 is a model developed by Stability AI for generating music and audio from text descriptions. Built with PyTorch, it gives creators and developers a tool for producing varied audio content, including music composition and sound design.

Speech & Audio, audio, music
14