Distil-Whisper
Hugging Face
• Framework: PyTorch

Distil-Whisper is a distilled version of OpenAI’s Whisper model created by Hugging Face. Implemented in PyTorch and licensed under MIT, it runs up to six times faster than Whisper and uses roughly half the parameters while staying within 1% of Whisper’s word error rate (WER) on out-of-distribution English evaluation sets. This makes it a good fit for real-time transcription in resource-constrained environments.
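As a quick illustration of that real-time use case, here is a minimal transcription sketch using the Hugging Face transformers pipeline. The checkpoint name (distil-whisper/distil-large-v2, one of the published Distil-Whisper checkpoints) and the audio path are illustrative; the sketch assumes the transformers and torch packages are installed.

```python
# Minimal sketch: transcribing an audio file with Distil-Whisper
# via the Hugging Face transformers ASR pipeline.
import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"

asr = pipeline(
    "automatic-speech-recognition",
    model="distil-whisper/distil-large-v2",  # illustrative checkpoint name
    torch_dtype=torch.float16 if device != "cpu" else torch.float32,
    device=device,
)

result = asr("speech_sample.wav")  # hypothetical local audio file
print(result["text"])
```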

Model Details
- Task: Speech-to-Text
- Parameter Count: N/A
- Datasets Used: LibriSpeech, Common Voice
Related AI Models

SpeechT5
Microsoft
SpeechT5 is a versatile speech processing model developed by Microsoft, designed to handle speech recognition, speech synthesis, and speech translation tasks within a unified framework. Built using PyTorch and released under the MIT license, it leverages transformer architectures for improved accuracy and flexibility in various speech applications, including voice assistants and translation systems.
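For orientation, here is a minimal text-to-speech sketch using the SpeechT5 classes in transformers. The checkpoint names follow the published microsoft/speecht5_tts and microsoft/speecht5_hifigan repositories; the zero speaker embedding is a placeholder for a real 512-dimensional x-vector (e.g. from the CMU Arctic x-vectors dataset), so the output voice will sound generic.

```python
# Minimal sketch of SpeechT5 text-to-speech with transformers.
import torch
import soundfile as sf
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Hello from SpeechT5.", return_tensors="pt")
speaker_embeddings = torch.zeros(1, 512)  # placeholder speaker identity

# Generate a 16 kHz waveform and write it to disk.
speech = model.generate_speech(
    inputs["input_ids"], speaker_embeddings, vocoder=vocoder
)
sf.write("speecht5_out.wav", speech.numpy(), samplerate=16000)
```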

wav2vec 2.0
Meta AI
wav2vec 2.0 is a self-supervised speech representation learning model developed by Meta AI that delivers strong automatic speech recognition (ASR) accuracy. Built on PyTorch and licensed under MIT, it pretrains on unlabeled audio and fine-tunes on small labeled sets, drastically reducing the need for transcribed data and making it well suited to multilingual transcription and voice applications. The model is widely used and integrated into the Hugging Face ecosystem.
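A minimal sketch of greedy CTC transcription with a fine-tuned wav2vec 2.0 checkpoint (facebook/wav2vec2-base-960h) via transformers; the audio file is illustrative and assumed to be 16 kHz mono.

```python
# Minimal sketch: greedy CTC decoding with a fine-tuned wav2vec 2.0 model.
import torch
import soundfile as sf
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

audio, sample_rate = sf.read("speech_sample.wav")  # hypothetical 16 kHz file
inputs = processor(audio, sampling_rate=sample_rate, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

predicted_ids = torch.argmax(logits, dim=-1)  # greedy CTC decoding
transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)
```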

DeepSpeech
Mozilla
DeepSpeech is an open-source automatic speech recognition (ASR) model developed by Mozilla, utilizing TensorFlow and licensed under the Mozilla Public License 2.0. It enables developers to build reliable, real-time speech-to-text transcription systems optimized for multiple languages and accents. Its architecture is designed for efficient deployment on edge devices and supports custom language model training.
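A minimal sketch against the DeepSpeech Python bindings (pip install deepspeech), assuming the filenames of the 0.9.3 release; the audio path is illustrative and must be 16-bit, 16 kHz, mono PCM. The external scorer line shows the custom language model support mentioned above.

```python
# Minimal sketch of the DeepSpeech Python API.
import wave
import numpy as np
import deepspeech

model = deepspeech.Model("deepspeech-0.9.3-models.pbmm")
model.enableExternalScorer("deepspeech-0.9.3-models.scorer")  # optional language model

# Read a 16-bit, 16 kHz, mono PCM file into an int16 buffer.
with wave.open("speech_sample.wav", "rb") as wav:  # hypothetical audio file
    frames = wav.readframes(wav.getnframes())
audio = np.frombuffer(frames, dtype=np.int16)

print(model.stt(audio))
```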