You want background music for a YouTube video, a game level, or a side project — but you don't want to pay for stock audio or hire a composer. That's exactly the problem Meta's MusicGen solves. This MusicGen API tutorial walks you through generating music from a text prompt, in both Python and JavaScript, using a free hosted endpoint that doesn't need a credit card.
By the end, you'll have working code that turns prompts like "lofi hip hop with rain sounds, 90 bpm" into actual playable audio files. No paid keys, no signup gymnastics.
I'll use the Hugging Face Inference API as the host for the facebook/musicgen-small model. It's the easiest way to call MusicGen without spinning up your own GPU server.
## What Is the MusicGen API?
MusicGen is an open-source audio generation model released by Meta in 2023. You give it a text description, and it returns a short music clip — usually 8 to 30 seconds depending on the model size and your settings.
The model itself is on GitHub, but running it locally needs a beefy GPU. So most developers call it through a hosted endpoint. Hugging Face offers a free Inference API tier that wraps MusicGen and returns raw audio bytes you can save as a WAV file.
One thing to know upfront: the free tier has rate limits. Hugging Face caps free Inference API calls at roughly 1,000 requests per day for anonymous use, and the model can take 20–60 seconds to respond when it's cold. That's not a bug — that's the trade-off for not paying.
## Why Use the MusicGen API?
- Free for hobby projects. No card, no trial expiry — just a free Hugging Face account for higher limits.
- No GPU needed. The heavy lifting happens on Hugging Face's servers.
- Simple HTTP interface. One POST request, audio bytes back. That's it.
- Good prompt control. You can specify genre, instruments, tempo, and mood in plain English.
- Commercially usable output. MusicGen is released under a permissive license for the model weights, though always check the latest terms before shipping commercially.
## Step-by-Step Setup
You need Python 3.8 or newer and the requests library. That's it for Python. For the JavaScript example, you need Node.js 18 or newer — earlier versions don't have native fetch().
Install the Python dependency:
```bash
pip install requests
```
You'll also want a Hugging Face account so you can grab a free token. Anonymous calls work too, but they're rate-limited harder and time out more often. Go to huggingface.co, sign up, then visit Settings → Access Tokens and create a read token.
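Rather than hardcoding the token in your script, you can read it from an environment variable. This is a common pattern, not something the API requires; the variable name `HF_TOKEN` here is just a convention:

```python
import os

# HF_TOKEN is a conventional variable name, not mandated by the API.
# Falls back to anonymous access (stricter rate limits) if unset.
token = os.environ.get("HF_TOKEN")

headers = {"Authorization": f"Bearer {token}"} if token else {}
print("Authenticated" if token else "Anonymous (stricter rate limits)")
```

This keeps the token out of version control; set it with `export HF_TOKEN=hf_...` before running your script.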
## MusicGen API Tutorial: Your First Request
Here's the smallest working example. It sends a prompt, gets back audio bytes, and saves them to a file.
### Python Example: Basic Fetch

```python
import requests

# Hugging Face hosted endpoint for the small MusicGen model
API_URL = "https://api-inference.huggingface.co/models/facebook/musicgen-small"

# Free token from huggingface.co/settings/tokens (read access is enough)
HEADERS = {"Authorization": "Bearer hf_your_token_here"}

# The text prompt describing the music you want
payload = {"inputs": "upbeat 8-bit chiptune for a retro platformer game"}

response = requests.post(API_URL, headers=HEADERS, json=payload)

# The response body is raw audio bytes (WAV format)
with open("output.wav", "wb") as f:
    f.write(response.content)

print(f"Saved {len(response.content)} bytes to output.wav")
```
Run that, wait 20–40 seconds, and you'll get an output.wav file in your folder. Open it in any audio player.
One catch: if the model is cold (hasn't been used recently), the first call returns a JSON response telling you it's loading. We handle that in the next example.
### Python Example: Practical Version with Error Handling

```python
import requests
import time

API_URL = "https://api-inference.huggingface.co/models/facebook/musicgen-small"
HEADERS = {"Authorization": "Bearer hf_your_token_here"}

# MusicGen small model max output is ~30 seconds per request
# Free tier: roughly 1000 requests/day, cold start can take 60s

def generate_music(prompt, output_file="track.wav", max_retries=3):
    payload = {"inputs": prompt}

    for attempt in range(max_retries):
        try:
            response = requests.post(
                API_URL,
                headers=HEADERS,
                json=payload,
                timeout=120,  # generation can be slow
            )
            response.raise_for_status()

            # If the model is loading, the response is JSON, not audio
            content_type = response.headers.get("content-type", "")
            if "application/json" in content_type:
                error_info = response.json()
                wait = error_info.get("estimated_time", 20)
                print(f"Model loading. Waiting {wait:.0f}s...")
                time.sleep(wait)
                continue

            # We got audio bytes — write them to disk
            with open(output_file, "wb") as f:
                f.write(response.content)
            size_kb = len(response.content) / 1024
            print(f"Success: {output_file} ({size_kb:.1f} KB)")
            return output_file

        except requests.exceptions.Timeout:
            print(f"Timeout on attempt {attempt + 1}. Retrying...")
        except requests.exceptions.HTTPError as e:
            print(f"HTTP error: {e.response.status_code} - {e.response.text[:200]}")
            return None

    print("Gave up after retries.")
    return None

# Try a couple of prompts
generate_music("calm acoustic guitar with soft piano, 70 bpm", "calm.wav")
generate_music("epic orchestral battle theme with brass and percussion", "battle.wav")
```
Sample output from a successful run:

```
Model loading. Waiting 23s...
Success: calm.wav (313.4 KB)
Success: battle.wav (308.7 KB)
```
The first call hits the cold-start case, waits, then retries. The second call reuses the warm model and completes much faster. This pattern is the difference between a script that "sometimes works" and one you can trust.
### JavaScript Example: Generate Music with Fetch and Error Handling

```javascript
// Node.js 18+ (needs native fetch; this version writes to disk with fs, so it's Node-only)
import fs from "fs";

const API_URL = "https://api-inference.huggingface.co/models/facebook/musicgen-small";
const TOKEN = "hf_your_token_here";

async function generateMusic(prompt, outputFile = "track.wav") {
  const payload = { inputs: prompt };

  try {
    const response = await fetch(API_URL, {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${TOKEN}`,
        "Content-Type": "application/json"
      },
      body: JSON.stringify(payload)
    });

    if (!response.ok) {
      throw new Error(`Request failed — HTTP ${response.status}`);
    }

    // Check whether we got JSON (model loading) or audio bytes
    const contentType = response.headers.get("content-type") || "";
    if (contentType.includes("application/json")) {
      const info = await response.json();
      const wait = info.estimated_time ?? 20;
      console.log(`Model loading. Wait about ${Math.round(wait)}s and retry.`);
      return;
    }

    // Read the audio bytes and save to disk
    const buffer = Buffer.from(await response.arrayBuffer());
    fs.writeFileSync(outputFile, buffer);
    const sizeKB = (buffer.length / 1024).toFixed(1);
    console.log(`Saved ${outputFile} (${sizeKB} KB)`);
  } catch (error) {
    console.error("Generation failed:", error.message);
  }
}

generateMusic("lofi hip hop beat with rain sounds, 90 bpm", "lofi.wav");
```
Sample console output:

```
Model loading. Wait about 22s and retry.
// After retry:
Saved lofi.wav (312.8 KB)
```
## Understanding the Output
Most APIs return JSON. MusicGen is different — when it works, it returns raw binary audio. Here's what you'll actually see depending on the state of the model:
- Success: Response body is a WAV file. Content-type is `audio/wav` or `audio/flac`. You write the bytes straight to disk.
- Model loading: Response is JSON like `{"error": "Model is currently loading", "estimated_time": 23.4}`. Wait, then retry.
- Rate limited: HTTP 429. Slow down or upgrade your account.
- Invalid token: HTTP 401. Check your token has read permission.
The audio itself is mono, 32 kHz, around 10 seconds long for the small model. File size lands around 300 KB per clip. Good enough for prototypes, game jams, and background loops.
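You can sanity-check a generated file with Python's standard `wave` module. This is a quick sketch that assumes the endpoint returned WAV (not FLAC):

```python
import wave

def describe_wav(path):
    """Print channel count, sample rate, and duration of a WAV file."""
    with wave.open(path, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        print(f"{w.getnchannels()} channel(s), {rate} Hz, "
              f"{frames / rate:.1f} s")
        return rate

# describe_wav("output.wav")
```

For a MusicGen clip you'd expect to see 1 channel at 32000 Hz; anything else suggests the response body wasn't the audio you thought it was.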
## Error Handling: What Actually Breaks
Real talk — the free tier has quirks. Here are the ones you'll hit.
Cold starts. If nobody has used the model in the last few minutes, your first call returns JSON with an `estimated_time` field. Don't treat this as an error — it's the API telling you to wait. The retry loop in the practical example handles it cleanly.
Timeouts. Generation can take 30–60 seconds. Set your client timeout to at least 120 seconds. Note that requests has no timeout at all by default, so without an explicit value a stalled connection will hang forever; a short timeout like 10 seconds will fail every time on cold starts.
Rate limits. The free Inference API limits anonymous users sharply and signed-in free accounts to roughly 1,000 calls per day. If you hit HTTP 429, back off for a few minutes. Don't hammer the endpoint in a loop.
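A simple exponential backoff wrapper keeps you from hammering the endpoint after a 429. This is a generic sketch, not Hugging Face-specific; the `call` argument stands in for whatever function makes your request:

```python
import time

def with_backoff(call, max_attempts=5, base_delay=2.0):
    """Retry `call` with exponential backoff while it returns HTTP 429.

    `call` should return a response-like object with a `status_code`;
    any status other than 429 is returned immediately.
    """
    for attempt in range(max_attempts):
        response = call()
        if response.status_code != 429:
            return response
        delay = base_delay * (2 ** attempt)  # 2s, 4s, 8s, ...
        print(f"Rate limited. Sleeping {delay:.0f}s...")
        time.sleep(delay)
    return response  # give up; caller sees the last 429
```

You'd wrap your request like `with_backoff(lambda: requests.post(API_URL, headers=HEADERS, json=payload, timeout=120))`.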
Prompt length. Prompts longer than about 256 characters get truncated. Keep them short and specific. "Jazz piano trio, slow tempo, melancholic" works better than three sentences of description.
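You can guard against silent truncation client-side. The 256-character limit here is an assumption based on the observed behavior above, not a documented constant:

```python
MAX_PROMPT_CHARS = 256  # observed truncation point; not an official limit

def check_prompt(prompt):
    """Warn when a prompt is likely to be truncated by the endpoint."""
    if len(prompt) > MAX_PROMPT_CHARS:
        print(f"Warning: prompt is {len(prompt)} chars; "
              f"only the first {MAX_PROMPT_CHARS} may be used.")
    return prompt[:MAX_PROMPT_CHARS]

short = check_prompt("jazz piano trio, slow tempo, melancholic")
```

Warning up front beats discovering that the tail of your carefully worded prompt was ignored.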
Wrong audio length. The `musicgen-small` model defaults to about 10 seconds. To get longer clips, you need to either run the model locally with custom parameters or use a different host that exposes the duration parameter.
## Real-World Use Cases
Game development. Generate placeholder background tracks while you prototype levels. Swap them for licensed music later, or keep the AI versions if they fit.
Video content. Quick intro stings and outros for YouTube or TikTok videos. Faster than browsing royalty-free libraries when you know the vibe you want.
Podcast bumpers. Short transition clips between segments. Generate five variations, pick the best one.
App soundscapes. Meditation apps, focus timers, or ambient backgrounds for productivity tools. The model handles "calm" and "ambient" prompts well.
## Comparison: Free Audio Generation APIs
| API | Free Tier Limit | Max Clip Length | Auth Required |
|---|---|---|---|
| MusicGen (Hugging Face) | ~1,000 req/day | ~10 sec (small model) | Free token recommended |
| Replicate MusicGen | ~50 req/month free credit | 30 sec configurable | API key required |
| Stable Audio Open (HF) | ~1,000 req/day | ~11 sec | Free token recommended |
| AudioCraft (self-hosted) | Unlimited | Configurable | Local GPU needed |
## FAQ
### Is the MusicGen API really free?
Yes, through Hugging Face's Inference API. You get roughly 1,000 requests per day with a free account. For higher volume or commercial production loads, you'll want a paid Hugging Face PRO plan or self-hosting.
### Can I use MusicGen output commercially?
Meta released MusicGen under a permissive license, but the situation around AI-generated music ownership is still evolving. For commercial projects, read the latest Meta MusicGen license terms and the Hugging Face usage policy before publishing.
### How do I generate music with AI for free without coding?
You can use the Hugging Face web Space for MusicGen at huggingface.co/spaces. But if you want to batch-generate or integrate into an app, the API is the way. This MusicGen API tutorial focuses on the coding path.
### What's the difference between the Meta MusicGen Python library and the API?
The Meta MusicGen Python library (audiocraft) runs the model locally and needs a GPU with at least 16 GB VRAM for the large model. The hosted API runs on Hugging Face's servers — slower per call, but no hardware required.
### Why does my first request take so long?
Cold start. Hugging Face spins down models that haven't been used recently. When you call it again, the server has to load the model into memory, which takes 20–60 seconds. Subsequent calls are much faster.
### Can I make longer tracks than 10 seconds?
Not directly with `musicgen-small` on the hosted API. You can generate multiple clips and stitch them, or switch to `musicgen-medium` or `musicgen-large`, or run the model locally to set a custom duration parameter.
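Stitching clips together can be done with the standard-library `wave` module. This sketch assumes all clips share the same sample rate, channel count, and sample width, which holds for clips from the same MusicGen model:

```python
import wave

def stitch_wavs(paths, output_path):
    """Concatenate WAV files with identical audio parameters into one file."""
    with wave.open(output_path, "wb") as out:
        for i, path in enumerate(paths):
            with wave.open(path, "rb") as clip:
                if i == 0:
                    # Copy channels, sample width, and rate from the first clip
                    out.setparams(clip.getparams())
                out.writeframes(clip.readframes(clip.getnframes()))

# stitch_wavs(["part1.wav", "part2.wav"], "full_track.wav")
```

The seams will be audible unless your prompts are similar, so this works best for ambient or loop-style material rather than structured songs.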
## Conclusion
You now have a working pipeline for generating AI music from text prompts. Python script, JavaScript version, error handling for cold starts and timeouts, and a sense of where the free tier limits actually bite.
Next step: experiment with prompts. Try genre + tempo + instrument combos. Try mood descriptors. Save what works in a prompt log. The model rewards specific prompts way more than vague ones.
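One way to explore combinations systematically is to generate a batch of prompts up front. The genre, tempo, and instrument lists here are just examples:

```python
from itertools import product

genres = ["lofi hip hop", "synthwave", "acoustic folk"]
tempos = ["70 bpm", "100 bpm", "128 bpm"]
instruments = ["with soft piano", "with analog synths"]

# Every genre/tempo/instrument combination: 3 * 3 * 2 = 18 prompts
prompts = [f"{g}, {t}, {i}" for g, t, i in product(genres, tempos, instruments)]
print(len(prompts), "prompt variations")
```

Feed these through `generate_music` one at a time (mind the daily rate limit), keep the outputs named after their prompts, and you have the beginnings of a prompt log.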
Want to find more free APIs to plug into your next project? Browse the Free API Hub directory for hand-picked options that need zero credit cards.