Category · 7 free & open source
🎙️

Audio & Voice

AI tools for text-to-speech in 60+ languages, zero-shot voice cloning, podcast editing with filler-word removal, AI music generation from text, and professional audio enhancement.

9 Tools
Most Popular In
Text-to-Speech, Voice Cloning, Podcast Editing
Notable Developers
ElevenLabs, Murf AI, Suno AI, Udio, Descript
Updated Jun 12, 2026
Curated by FreeAPIHub editors
Topics: Text-to-Speech, Voice Cloning, Podcast Editing, AI Music Generation, Audio Enhancement, Voice Conversion
⭐Top Resources

Stable Audio Open

Tool · Stability AI
Open Source

Stable Audio Open is Stability AI's open-weight model for generating sound effects, samples and short instrumental audio from text prompts. Run it free on your own hardware and fine-tune it on your sounds.

Open weights · Not rated yet

Cleanvoice AI

Tool · Cleanvoice
Freemium

Cleanvoice is an AI audio editor that automatically cleans up podcasts and recordings. It removes filler words, stutters, mouth sounds and dead air, and can level audio - turning a raw take into a clean episode.

100K+ users · Not rated yet

Suno AI

Tool · Suno
Freemium

Suno is an AI music generator that creates full songs - vocals, lyrics and instruments - from a text prompt. Describe a style and topic and get a complete, original track in seconds, no instruments required.

Millions of users · Not rated yet

Whisper

Tool · OpenAI
Open Source

Whisper is OpenAI's open-source speech-to-text model. Run it free on your own machine for unlimited transcription in ~99 languages, or call the hosted API at $0.006/minute.

Millions of users · Not rated yet

Descript

Tool · Descript
Freemium

Descript edits audio and video by editing text. Transcribe a recording, then cut, rearrange and fix it like a document, with AI tools for filler-word removal, voice cloning and studio-quality sound.

Millions of users · Not rated yet

PlayHT

Tool · PlayHT
Freemium

PlayHT (Play.ai) is an AI voice platform for realistic text-to-speech and voice cloning. Generate lifelike voiceovers in many languages, clone a voice, and build real-time voice agents through its API.

Millions of users · Not rated yet

Murf AI

🔥 Hot
by Murf AI

Murf is an AI voice studio for professional voiceovers. Generate natural speech in many voices and languages, sync it to slides or video, clone a voice, and dub content - with an API for developers.

Freemium

ElevenLabs

🔥 Hot
by ElevenLabs

ElevenLabs is a leading AI voice platform for lifelike text-to-speech, voice cloning, dubbing and a conversational voice API. It supports many languages and powers audiobooks, agents and video voiceovers.

Freemium

Krisp

🔥 Hot
by Krisp

Krisp is an AI voice tool that removes background noise, echo and other voices from calls in real time. It works with any conferencing app, and adds meeting transcription, notes and an AI meeting assistant.

Freemium
Showing 9 of 9 resources

About this category

Audio & Voice: developer guide

What Are AI Audio and Voice Tools?

AI audio and voice tools handle every stage of the audio content lifecycle: generating natural speech, cloning voices, editing recordings, creating music, and enhancing audio quality. What previously required a recording studio, professional voice talent, and a mixing engineer can now be done with a text prompt and a browser tab. These tools are used by podcasters, game developers, corporate training teams, musicians, and accessibility engineers building voice interfaces.

What Creators and Developers Build

  • Multilingual voiceover production for videos and e-learning courses at a fraction of studio cost
  • Voice-cloned AI assistants that sound like a specific brand persona
  • Podcast editing automation that removes filler words, silences, and background noise
  • AI-generated soundtracks and background music for games and apps
  • Accessibility features, such as natural-sounding screen readers for visually impaired users
  • Interactive voice response (IVR) systems with human-sounding synthetic speech
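To make the podcast-editing item above more concrete, here is a minimal sketch of how a transcript-driven editor might turn word timings into a cut list. It is not how any listed tool actually works: the `Word` type, the `FILLERS` set, the `keep_segments` function, and the 0.5-second gap threshold are all hypothetical, and the word-level timestamps are assumed to come from a speech-to-text step that is not shown.

```python
from dataclasses import dataclass

# Hypothetical filler list; real editors use richer, context-aware detection.
FILLERS = {"um", "uh", "erm", "hmm"}


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def keep_segments(words, fillers=FILLERS, max_gap=0.5):
    """Return (start, end) spans of audio to keep.

    Filler words are dropped, and a new span is started whenever a filler
    was removed or the silence before a word exceeds max_gap seconds.
    """
    segments = []
    can_extend = False  # True only if the previous word was kept
    for w in words:
        if w.text.lower().strip(".,!?") in fillers:
            can_extend = False  # force a cut around the filler
            continue
        if can_extend and w.start - segments[-1][1] <= max_gap:
            # Close to the previous kept word: extend the current span.
            segments[-1] = (segments[-1][0], w.end)
        else:
            segments.append((w.start, w.end))
        can_extend = True
    return segments


if __name__ == "__main__":
    demo = [
        Word("So", 0.0, 0.3),
        Word("um", 0.4, 0.7),
        Word("welcome", 0.8, 1.3),
        Word("back", 1.4, 1.8),
        Word("everyone", 3.0, 3.6),
    ]
    print(keep_segments(demo))
```

The returned spans could then be handed to any audio library to concatenate the kept regions. Background-noise removal is a separate signal-processing step and is not covered by this sketch.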

Top AI Audio Providers

ElevenLabs leads on voice quality and cloning: it generates studio-grade speech in 32 languages with fine-grained emotional control and requires only a 1-minute sample for voice cloning. Murf AI is the enterprise choice for team-based voiceover production with a built-in studio editor. Suno AI and Udio generate full songs with vocals and instrumentation from a text prompt, which makes them popular with content creators and indie game developers. Descript combines AI transcription, filler-word removal, and multi-track editing in one desktop app. All offer free tiers whose usage limits leave enough room for prototyping.
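When planning beyond a free tier, a quick cost estimate helps. The Whisper listing above quotes $0.006 per minute for the hosted API; the sketch below turns that figure into a budget number. The function name `hosted_cost_usd` is hypothetical, and the code assumes billing is strictly proportional to audio duration, which may not match a provider's actual rounding rules.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate quoted in the Whisper listing on this page (USD per audio minute).
HOSTED_RATE_PER_MINUTE = Decimal("0.006")


def hosted_cost_usd(audio_seconds: int, rate: Decimal = HOSTED_RATE_PER_MINUTE) -> Decimal:
    """Estimate hosted transcription cost, rounded to the nearest cent.

    Assumes cost scales linearly with duration (an assumption, not a
    documented billing rule).
    """
    minutes = Decimal(audio_seconds) / Decimal(60)
    return (minutes * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    one_hour = 60 * 60
    print(hosted_cost_usd(one_hour))        # one hour of audio
    print(hosted_cost_usd(100 * one_hour))  # a 100-hour back catalogue
```

Using `Decimal` instead of floats keeps the cent rounding predictable. Running the open-source model on your own hardware avoids the per-minute charge but shifts the cost to compute.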