open source

SeamlessM4T v2

Provided by: Meta AI
Framework: Unknown

SeamlessM4T v2 is Meta AI’s multilingual speech and text translation model, covering roughly 100 languages. A single unified architecture handles automatic speech recognition (ASR), text-to-text translation, speech-to-text translation, text-to-speech translation, and speech-to-speech translation. Compared to its predecessor, SeamlessM4T v2 improves latency, speech naturalness, and contextual accuracy. The model targets cross-lingual communication, which makes it well suited to global communication tools, accessibility apps, and cross-border collaboration systems. As an open-source model under Meta’s Seamless Communication initiative, it lets developers integrate multilingual translation and speech synthesis into their own AI products.
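As a rough illustration of the integration path described above, the sketch below runs text-to-text translation through the Hugging Face `transformers` library. It assumes that `transformers` and PyTorch are installed and that the `facebook/seamless-m4t-v2-large` checkpoint is used; neither is stated on this page. The helper name `translate_text` is hypothetical.

```python
def translate_text(text, src_lang, tgt_lang, processor=None, model=None,
                   checkpoint="facebook/seamless-m4t-v2-large"):
    """Translate `text` from `src_lang` to `tgt_lang` (three-letter codes such as "eng", "fra").

    `processor` and `model` can be passed in to reuse already-loaded objects;
    otherwise they are loaded from `checkpoint` (an assumed checkpoint name).
    """
    if processor is None or model is None:
        # Imported lazily: transformers is a heavy dependency and the first
        # load downloads several gigabytes of weights.
        from transformers import AutoProcessor, SeamlessM4Tv2Model
        processor = processor or AutoProcessor.from_pretrained(checkpoint)
        model = model or SeamlessM4Tv2Model.from_pretrained(checkpoint)

    # Tokenize the source text and tag it with the source language.
    inputs = processor(text=text, src_lang=src_lang, return_tensors="pt")

    # generate_speech=False asks for translated text tokens instead of a waveform.
    output_tokens = model.generate(**inputs, tgt_lang=tgt_lang, generate_speech=False)

    # The first element holds the generated token ids for each input sequence.
    return processor.decode(output_tokens[0].tolist()[0], skip_special_tokens=True)
```

Calling `translate_text("Hello, how are you?", "eng", "fra")` would load the model on first use and return the French translation as a string. Speech input and speech output follow the same pattern with audio arrays passed to the processor, but they are left out here to keep the sketch short.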

Model Performance Statistics

Views: 0

Released: March 17, 2025

Last Checked: Aug 19, 2025

Version: 2.0

Capabilities
  • Speech-to-speech
  • Multilingual ASR
  • Text translation
Performance Benchmarks
WER: 5.2%
BLEU: 42.1
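The page does not say which test set or scoring tool produced the WER figure listed above. For orientation, word error rate is conventionally the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The sketch below implements that standard definition; the function name and the sample sentences are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row in memory.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of an extra hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors over 6 reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Under this definition, a WER of 5.2% corresponds to about one word error per 19 reference words.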
Technical Specifications
Parameter Count: N/A
Training & Dataset

Dataset Used: Unified Speech Translation Corpus

Related AI Models

Discover similar AI models that might interest you

Model (open source)

Fairseq

Meta AI

Fairseq is Meta AI’s open-source PyTorch-based toolkit for training sequence-to-sequence models, widely used in machine translation, text summarization, and other NLP applications.

Natural Language Processing, nlp, translation
38
Model (open source)

DeepSpeech

Mozilla

DeepSpeech is an open-source automatic speech recognition (ASR) model developed by Mozilla, utilizing TensorFlow and licensed under the Mozilla Public License 2.0. It enables developers to build reliable, real-time speech-to-text transcription systems optimized for multiple languages and accents. Its architecture is designed for efficient deployment on edge devices and supports custom language model training.

Speech & Audio, speech-recognition, voice
16
Model (open source)

Pix2Pix

UC Berkeley

Pix2Pix is an open-source image-to-image translation model developed by researchers at UC Berkeley. Based on conditional GANs and implemented in TensorFlow and PyTorch, Pix2Pix can convert sketches, segmentation maps, or black-and-white photos into realistic images. Widely used for artistic rendering, style transfer, and data augmentation, it’s a foundational model in generative vision tasks.

Computer Vision, translation
16