FreeAPIHub

The central hub for discovering, testing, and integrating the world's best AI models and APIs.

© 2026 FreeAPIHub. All rights reserved.


Multimodal

Explore 8 AI models in this category (no APIs are listed yet).


Multimodal · BAAI (baaivision)

Emu2-Chat

Open Source · PyTorch

Emu2-Chat is the instruction-tuned conversational variant of BAAI's Emu2 generative multimodal model, designed for context-aware dialogue over interleaved image and text input.

  • Views: 256
  • Favorites: 0
  • Released: 2023
  • Official URL: https://baaivision.github.io/emu2/
  • Tags: conversational
Multimodal · Tsinghua University

CogVLM

Open Source · PyTorch

CogVLM is an advanced open-source vision-language model developed by Tsinghua University that supports multimodal tasks such as image captioning, visual question answering, and visual grounding.

  • Views: 136
  • Favorites: 0
  • Released: 2023
  • Official URL: https://github.com/THUDM/CogVLM
  • Tags: Multimodal AI
Multimodal · DeepSeek AI

DeepSeek-VL

Open Source · PyTorch

DeepSeek-VL is an open-source multimodal AI model that integrates vision and language processing to enable tasks like image captioning, semantic search, and cross-modal retrieval.

  • Views: 228
  • Favorites: 0
  • Released: 2024
  • Official URL: https://github.com/deepseek-ai/DeepSeek-VL
  • Tags: Multimodal AI
Multimodal · Baidu

ERNIE-ViL

Open Source · PaddlePaddle

ERNIE-ViL is a multimodal AI model developed by Baidu that integrates vision and language understanding into a unified framework.

  • Views: 101
  • Favorites: 0
  • Released: 2020
  • Official URL: https://github.com/PaddlePaddle/ERNIE
  • Tags: Multimodal AI
Multimodal · OpenAI

CLIP

Open Source · PyTorch

CLIP (Contrastive Language–Image Pretraining) is an open-source multimodal model developed by OpenAI that learns visual concepts from natural language supervision.

  • Views: 192
  • Favorites: 0
  • Released: 2021
  • Official URL: https://openai.com/research/clip
  • Tags: Multimodal AI, image-text embedding
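The matching logic behind CLIP-style models can be sketched in a few lines: images and captions are embedded into a shared space, L2-normalized, and compared by temperature-scaled cosine similarity. The sketch below is a minimal illustration, not CLIP's implementation; the 2-D toy vectors stand in for real encoder outputs, and the temperature value is just CLIP's commonly cited default.

```python
import math

def normalize(v):
    """L2-normalize a vector so dot products become cosine similarities."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def clip_style_scores(image_embs, text_embs, temperature=0.07):
    """Temperature-scaled cosine-similarity logits between every image
    embedding and every text embedding, as in a contrastive setup."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
            for i in imgs]

# Toy 2-D embeddings standing in for real encoder outputs.
images = [[1.0, 0.0], [0.0, 1.0]]
texts = [[0.9, 0.1], [0.1, 0.9]]

scores = clip_style_scores(images, texts)
# Each image's best caption is the argmax over its row of logits.
best = [max(range(len(row)), key=row.__getitem__) for row in scores]
print(best)  # → [0, 1]
```

In a real pipeline the rows of logits would feed a softmax cross-entropy loss in both directions (image-to-text and text-to-image), which is what pulls matching pairs together during training.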
Multimodal · University of Wisconsin–Madison

LLaVA-NeXT

Open Source · PyTorch

LLaVA-NeXT is a next-generation multimodal large language model developed by the LLaVA team at the University of Wisconsin–Madison, building upon the LLaVA framework with higher-resolution input and improved visual reasoning and OCR.

  • Views: 75
  • Favorites: 0
  • Released: 2024
  • Official URL: https://llava-vl.github.io/
  • Tags: ai-models, vision language AI
Multimodal · Meta AI

Chameleon 7B

Open Source · PyTorch

Chameleon 7B is a mixed-modal foundation model developed by Meta AI that unifies text and image understanding within a single early-fusion, token-based transformer architecture.

  • Views: 62
  • Favorites: 0
  • Released: 2024
  • Official URL: https://huggingface.co/facebook/chameleon-7b
  • Tags: llm, reasoning LLM, ai-models
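"Early fusion" means that image content is quantized into discrete tokens that share one sequence with text tokens, so a single transformer consumes both modalities, rather than routing images through a separate encoder. The interleaving step can be sketched as below; all token IDs, markers, and the toy tokenizers are invented for illustration, not Chameleon's real vocabulary.

```python
# Begin/end-of-image marker IDs (hypothetical values).
BOI, EOI = 9000, 9001

def tokenize_text(text):
    # Toy text tokenizer: one ID per word, offset into a "text" ID range.
    return [1000 + hash(w) % 1000 for w in text.split()]

def tokenize_image(patches):
    # Toy image tokenizer: pretend each patch was quantized to a codebook ID.
    return [5000 + p % 1000 for p in patches]

def build_sequence(segments):
    """Interleave ('text', str) and ('image', patch-list) segments into one
    flat token sequence, the way an early-fusion model consumes them."""
    seq = []
    for kind, payload in segments:
        if kind == "text":
            seq.extend(tokenize_text(payload))
        else:
            seq.extend([BOI, *tokenize_image(payload), EOI])
    return seq

seq = build_sequence([
    ("text", "describe this picture"),
    ("image", [12, 345, 678, 901]),   # fake quantized patch codes
    ("text", "in one sentence"),
])
```

Because everything lands in one token stream, the same autoregressive objective covers both modalities, which is the design choice that distinguishes early fusion from late-fusion models that bolt a vision encoder onto a language model.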
Multimodal · Microsoft

Kosmos-2.5

Open Source · PyTorch

Kosmos-2.5 is Microsoft's multimodal literate model for machine reading of text-intensive images, generating spatially aware text blocks and structured markdown output from document images.

  • Views: 113
  • Favorites: 0
  • Released: 2023
  • Official URL: https://github.com/microsoft/unilm
  • Tags: enterprise, vision language AI