OpenVoice
MyShell
• Framework: PyTorch

OpenVoice V2 is an open-source voice cloning and speech synthesis model developed by MyShell AI with contributions from MIT and Tsinghua University. Released under the MIT license in April 2024, it enables accurate tone color cloning, flexible style control (emotion, accent, rhythm), and zero-shot cross-lingual voice cloning across English, Japanese, Chinese, Spanish, French, and Korean using only a short reference audio clip.
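To make the cloning workflow concrete, below is a minimal sketch of zero-shot tone color cloning that follows the usage pattern shown in the myshell-ai/OpenVoice repository demos. The checkpoint layout (checkpoints_v2/...) and the file names reference_speaker.mp3 and base_tts_output.wav are placeholder assumptions for illustration, not values from this listing.

```python
import torch
from openvoice import se_extractor
from openvoice.api import ToneColorConverter

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tone color converter from the V2 checkpoint bundle (path assumed).
converter = ToneColorConverter("checkpoints_v2/converter/config.json", device=device)
converter.load_ckpt("checkpoints_v2/converter/checkpoint.pth")

# Extract the target speaker's tone color embedding from a short reference clip.
target_se, _ = se_extractor.get_se("reference_speaker.mp3", converter, vad=True)

# Source embedding of the base speaker whose speech will be re-voiced
# (shipped as .pth files alongside the V2 checkpoints).
source_se = torch.load("checkpoints_v2/base_speakers/ses/en-newest.pth", map_location=device)

# Convert base TTS audio (e.g. generated with MeloTTS) into the cloned voice.
converter.convert(
    audio_src_path="base_tts_output.wav",
    src_se=source_se,
    tgt_se=target_se,
    output_path="cloned_output.wav",
)
```

The key idea is the two-stage design: any base TTS system produces the speech content and style, and the tone color converter then transfers the target speaker's timbre onto it, which is what allows cloning into languages the reference speaker never spoke.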

Model Performance Statistics
- Released: April 2024
- Version: V2
- Tasks: Voice Cloning, TTS
- Parameter Count: N/A
- Dataset Used: VCTK, LibriTTS
Related AI Models

wav2vec 2.0
Meta AI
wav2vec 2.0 is a self-supervised speech representation learning model developed by Meta AI, offering state-of-the-art performance in automatic speech recognition (ASR). Built on PyTorch and licensed under MIT, it drastically reduces the need for labeled data, making it ideal for multilingual transcription and voice applications. The model is widely used and integrated into the Hugging Face ecosystem.
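Because the model is available through Hugging Face, a short transcription run looks like the sketch below. The model id facebook/wav2vec2-base-960h and the 16 kHz mono input requirement follow its public model card; the audio file name is a placeholder.

```python
import soundfile as sf
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Load the pretrained model and its processor (feature extractor + tokenizer).
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# wav2vec 2.0 expects 16 kHz mono audio; "sample_16khz.wav" is a placeholder path.
speech, sample_rate = sf.read("sample_16khz.wav")
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt", padding=True)

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: pick the most likely token per frame, then collapse repeats/blanks.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```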

SpeechT5
Microsoft
SpeechT5 is a versatile speech processing model developed by Microsoft, designed to handle speech recognition, speech synthesis, and speech translation tasks within a unified framework. Built using PyTorch and released under the MIT license, it leverages transformer architectures for improved accuracy and flexibility in various speech applications, including voice assistants and translation systems.
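The text-to-speech side of that unified framework can be exercised with the Hugging Face checkpoints, as in the sketch below. The zero-vector speaker embedding is a crude placeholder assumption; in practice a real 512-dim x-vector (e.g. from the CMU ARCTIC x-vector set) yields a natural voice.

```python
import soundfile as sf
import torch
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="SpeechT5 handles recognition, synthesis, and translation.",
                   return_tensors="pt")

# SpeechT5 conditions generation on a speaker embedding; zeros are a placeholder only.
speaker_embeddings = torch.zeros((1, 512))

# Generate a mel spectrogram and vocode it to a 16 kHz waveform in one call.
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("speecht5_tts_demo.wav", speech.numpy(), samplerate=16000)
```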

FastSpeech 2
Microsoft Research Asia
FastSpeech 2 is an improved neural text-to-speech model from Microsoft that generates natural-sounding speech quickly and efficiently. Built with PyTorch and licensed under MIT, it enhances prosody modeling and robustness, making it suitable for real-time voice assistants, audiobooks, and accessibility tools. The open-source code allows developers to customize and deploy the model easily.
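One way to try the open-source model is through the fairseq checkpoint published on the Hugging Face Hub. The sketch below follows the usage pattern documented for the facebook/fastspeech2-en-ljspeech checkpoint; the model id, HiFi-GAN vocoder choice, and output file name are assumptions from that card, and exact call signatures may vary across fairseq versions.

```python
import soundfile as sf
from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf_hub
from fairseq.models.text_to_speech.hub_interface import TTSHubInterface

# Load the pretrained FastSpeech 2 checkpoint together with a HiFi-GAN vocoder.
models, cfg, task = load_model_ensemble_and_task_from_hf_hub(
    "facebook/fastspeech2-en-ljspeech",
    arg_overrides={"vocoder": "hifigan", "fp16": False},
)
model = models[0]
TTSHubInterface.update_cfg_with_data_cfg(cfg, task.data_cfg)
generator = task.build_generator([model], cfg)

# Synthesize one sentence non-autoregressively and write the waveform to disk.
sample = TTSHubInterface.get_model_input(task, "FastSpeech Two generates speech in a single pass.")
wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)
sf.write("fastspeech2_demo.wav", wav.numpy(), rate)
```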