Embeddings

Text and multimodal embedding models for semantic search, RAG pipelines, clustering, and similarity retrieval — including NVIDIA NV-Embed-v2 (72.31 MTEB) and Qwen3-Embedding for multilingual use.

3 AI Models
Notable Developers
NVIDIA (NV-Embed-v2), Alibaba (Qwen3-Embedding), BAAI (BGE), Cohere, Nomic AI
Updated Jun 12, 2026
Curated by FreeAPIHub editors
Topics: Text Embeddings, Multimodal Embeddings, Semantic Search, RAG Pipelines, Multilingual Embeddings, Embedding Fine-Tuning

BGE v3

🔥 Hot
by Beijing Academy of AI (BAAI) · 8K ctx

BGE-M3 is a versatile open text-embedding model from the Beijing Academy of AI. It is multilingual (100+ languages), multi-functional (dense, sparse and multi-vector retrieval) and handles long inputs up to 8K tokens.

MIT · ~568M (M3)

E5-Mistral

🔥 Hot
by Microsoft Research · 32K ctx

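E5-Mistral-7B-Instruct is a high-performing text-embedding model from Microsoft, built on Mistral-7B and trained partly on synthetic data. It produces strong embeddings for search and RAG. The Mistral-7B base supports a 32K context, but the model card recommends keeping inputs to 4096 tokens or fewer and using the model mainly for English, since its multilingual capability is limited.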

Nomic Embed

🔥 Hot
by Nomic AI · 8K ctx

Nomic Embed is a fully open text-embedding model with a long 8192-token context. Trained with a completely open data and code pipeline, it matches or beats popular closed embedders while remaining lightweight and Apache-2.0 licensed.

Apache 2.0 · ~137M

About this category

Embeddings — developer guide

What Are Embedding Models?

Embedding models convert text, images, audio, and documents into dense numerical vectors: lists of floating-point numbers that encode semantic meaning. Two pieces of content about the same topic will have similar vectors even if they share no words. This property powers semantic search (finding documents by meaning, not keyword), RAG systems (retrieving relevant context before generating an LLM response), duplicate detection, clustering, and recommendation engines. Many production AI features in 2025–2026 rely on embedding models behind the scenes.
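To make "similar vectors" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The 4-dimensional vectors are hand-written for illustration and are not output from any model listed here; real models produce hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (assumption: made up by hand, not model output).
toy_vectors = {
    "How do I reset my password?": [0.9, 0.1, 0.0, 0.2],
    "Steps to recover account access": [0.8, 0.2, 0.1, 0.3],
    "Best pizza toppings": [0.0, 0.9, 0.8, 0.1],
}

reset = toy_vectors["How do I reset my password?"]
recover = toy_vectors["Steps to recover account access"]
pizza = toy_vectors["Best pizza toppings"]

# The two account-related sentences share almost no words,
# yet their toy vectors point in nearly the same direction.
print(cosine_similarity(reset, recover) > cosine_similarity(reset, pizza))
```

In practice the vectors would come from an embedding model, and the comparison would stay the same.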

What Developers Build With Embeddings

  • RAG (Retrieval-Augmented Generation) pipelines that fetch relevant document chunks before prompting an LLM
  • Semantic search engines that surface results by meaning rather than exact keyword match
  • Recommendation systems that match users to content, products, or other users by interest similarity
  • Duplicate and near-duplicate detection across large document collections
  • Zero-shot text classifiers that compare input to labelled class descriptions
  • Multilingual search indexes that work across languages without language-specific tuning
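Several of the use cases above (RAG retrieval, semantic search, recommendations) reduce to the same step: rank stored vectors by similarity to a query vector and keep the top few. The sketch below assumes the vectors have already been produced by some embedding model; the document IDs, the vectors, and the helper names `build_index` and `top_k` are hypothetical. A production system would typically use a vector database or an approximate nearest-neighbour index rather than a linear scan.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def build_index(doc_vectors):
    """Store unit-length vectors so a dot product equals cosine similarity."""
    return {doc_id: normalize(vec) for doc_id, vec in doc_vectors.items()}


def top_k(query_vec, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    q = normalize(query_vec)
    scored = (
        (sum(a * b for a, b in zip(q, vec)), doc_id)
        for doc_id, vec in index.items()
    )
    return [(doc_id, score) for score, doc_id in heapq.nlargest(k, scored)]


# Hypothetical 3-dimensional vectors, hand-written for illustration.
index = build_index({
    "billing-faq": [0.1, 0.9, 0.2],
    "password-reset": [0.9, 0.1, 0.1],
    "login-troubleshooting": [0.6, 0.5, 0.1],
})

results = top_k([1.0, 0.2, 0.0], index, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

In a RAG pipeline, the text of the returned documents would then be placed into the LLM prompt as context.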

Top Embedding Models in 2026

On the MTEB benchmark, NVIDIA NV-Embed-v2 leads English retrieval with a 72.31 average score. Qwen3-Embedding-8B (70.58 MTEB) is the strongest multilingual option listed here; it supports flexible output dimensions from 32 to 4,096, which lets large indexes use shorter vectors to cut storage costs. BGE-en-ICL (BAAI) achieves 71.24 MTEB and uses in-context learning for domain adaptation. For managed APIs, OpenAI text-embedding-3-large (64.6 MTEB) and Cohere embed-v4 (65.2 MTEB, multimodal) are the most widely integrated. For self-hosted use, Nomic Embed Text offers an excellent quality-to-size ratio and is Apache-2.0 licensed. Scores reported on different MTEB leaderboards (for example, English versus multilingual) are not directly comparable.
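The storage effect of a smaller output dimension can be estimated with simple arithmetic. The sketch below assumes vectors stored as float32 (4 bytes per value) and ignores index overhead and metadata; the 10-million-vector corpus is a hypothetical figure. The `truncate_and_renormalize` helper shows one common way to shorten a vector client-side, which only preserves quality for models trained to support shortened outputs, so check the model's documentation before relying on it.

```python
import math


def index_size_bytes(num_vectors, dims, bytes_per_value=4):
    """Raw vector storage, assuming float32 values and no index overhead."""
    return num_vectors * dims * bytes_per_value


def truncate_and_renormalize(vec, dims):
    """Keep the first `dims` values and rescale the result to unit length."""
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        raise ValueError("truncated vector has zero length")
    return [x / norm for x in head]


NUM_VECTORS = 10_000_000  # hypothetical corpus size

for dims in (4096, 1024, 32):
    gib = index_size_bytes(NUM_VECTORS, dims) / 2**30
    print(f"{dims:>5} dims: {gib:.1f} GiB")

# Toy 3-dimensional vector shortened to 2 dimensions.
short = truncate_and_renormalize([3.0, 4.0, 12.0], 2)
```

Under these assumptions, the raw vectors take about 152.6 GiB at 4,096 dimensions and about 38.1 GiB at 1,024. Storage scales linearly with the dimension, so the trade-off to measure is how much retrieval quality is lost at each size.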