Multimodal Models

Multimodal LLMs: Text, Image, Audio, and Beyond

Compare multimodal LLMs, which accept images, audio, and other input types alongside text. They support applications such as visual Q&A, document analysis, and creative content generation.

135 models · 18 providers · $0.0300 to $150.00 per 1M input tokens

All Multimodal Models

| Provider | Model | Input $/1M | Output $/1M | Context | Quality |
|---|---|---|---|---|---|
| Mistral | Mistral Small 3.1 24b Instruct | $0.0300 | $0.1100 | 131.1K | 63.0 Strong |
| Google | Gemma 3 4b It | $0.0400 | $0.0800 | 131.1K | 63.0 Strong |
| Google | Gemma 3 12b It | $0.0400 | $0.1300 | 131.1K | 69.9 Frontier |
| Meta | Llama 3.2 11b Vision Instruct | $0.0490 | $0.0490 | 131.1K | N/A |
| Qwen | Qwen3.5 9b | $0.0500 | $0.1500 | 256K | N/A |
| OpenAI | GPT 5 Nano | $0.0500 | $0.4000 | 400K | 49.9 Good |
| Amazon | Nova Lite V1 | $0.0600 | $0.2400 | 300K | N/A |
| Qwen | Qwen3.5 Flash 02 23 | $0.0650 | $0.2600 | 1M | 80.6 Frontier |
| Bytedance-seed | Seed 1.6 Flash | $0.0750 | $0.3000 | 262.1K | N/A |
| Mistral | Mistral Small 3.2 24b Instruct | $0.0750 | $0.2000 | 128K | N/A |
| Google | Gemini 2.0 Flash Lite 001 | $0.0750 | $0.3000 | 1.0M | 72.0 Frontier |
| Qwen | Qwen3 Vl 8b Instruct | $0.0800 | $0.5000 | 131.1K | N/A |
| Meta | Llama 4 Scout | $0.0800 | $0.3000 | 327.7K | N/A |
| Google | Gemma 3 27b It | $0.0800 | $0.1600 | 131.1K | 74.2 Frontier |
| Bytedance-seed | Seed 2.0 Mini | $0.1000 | $0.4000 | 262.1K | N/A |
| Mistral | Ministral 3b 2512 | $0.1000 | $0.1000 | 131.1K | N/A |
| Google | Gemini 2.5 Flash Lite Preview 09 2025 | $0.1000 | $0.4000 | 1.0M | 40.8 Good |
| Bytedance | Ui Tars 1.5 7b | $0.1000 | $0.2000 | 128K | N/A |
| Google | Gemini 2.5 Flash Lite | $0.1000 | $0.4000 | 1.0M | 40.9 Good |
| OpenAI | GPT 4.1 Nano | $0.1000 | $0.4000 | 1.0M | 66.3 Frontier |
| Google | Gemini 2.0 Flash 001 | $0.1000 | $0.4000 | 1.0M | 73.3 Frontier |
| Mistral | Pixtral 12b | $0.1000 | $0.1000 | 32.8K | N/A |
| Qwen | Qwen3 Vl 32b Instruct | $0.1040 | $0.4160 | 131.1K | N/A |
| Qwen | Qwen3 Vl 8b Thinking | $0.1170 | $1.36 | 131.1K | N/A |
| Qwen | Qwen3 Vl 30b A3b Thinking | $0.1300 | $1.56 | 131.1K | N/A |
| Qwen | Qwen3 Vl 30b A3b Instruct | $0.1300 | $0.5200 | 131.1K | N/A |
| Qwen | Qwen Vl Plus | $0.1365 | $0.4095 | 131.1K | N/A |
| Baidu | Ernie 4.5 Vl 28b A3b | $0.1400 | $0.5600 | 30K | N/A |
| Mistral | Mistral Small 2603 | $0.1500 | $0.6000 | 262.1K | 72.7 Frontier |
| Mistral | Ministral 8b 2512 | $0.1500 | $0.1500 | 262.1K | 51.2 Strong |
| Meta | Llama 4 Maverick | $0.1500 | $0.6000 | 1.0M | N/A |
| OpenAI | GPT 4o Mini 2024 07 18 | $0.1500 | $0.6000 | 128K | 65.6 Frontier |
| OpenAI | GPT 4o Mini | $0.1500 | $0.6000 | 128K | 65.6 Frontier |
| Qwen | Qwen3.5 35b A3b | $0.1625 | $1.30 | 262.1K | 80.5 Frontier |
| Arcee-ai | Spotlight | $0.1800 | $0.1800 | 131.1K | N/A |
| Meta | Llama Guard 4 12b | $0.1800 | $0.1800 | 163.8K | N/A |
| Qwen | Qwen3.5 27b | $0.1950 | $1.56 | 262.1K | 81.5 Frontier |
| OpenAI | GPT 5.4 Nano | $0.2000 | $1.25 | 400K | 73.8 Frontier |
| Mistral | Ministral 14b 2512 | $0.2000 | $0.2000 | 262.1K | N/A |
| xAI | Grok 4.1 Fast | $0.2000 | $0.5000 | 2M | 62.5 Strong |
| NVIDIA | Nemotron Nano 12b V2 Vl | $0.2000 | $0.6000 | 131.1K | N/A |
| Qwen | Qwen3 Vl 235b A22b Instruct | $0.2000 | $0.8800 | 262.1K | 79.6 Frontier |
| xAI | Grok 4 Fast | $0.2000 | $0.5000 | 2M | 63.1 Strong |
| Qwen | Qwen2.5 Vl 32b Instruct | $0.2000 | $0.6000 | 128K | N/A |
| Minimax | Minimax 01 | $0.2000 | $1.10 | 1.0M | N/A |
| Qwen | Qwen 2.5 Vl 7b Instruct | $0.2000 | $0.2000 | 32.8K | N/A |
| Bytedance-seed | Seed 2.0 Lite | $0.2500 | $2.00 | 262.1K | N/A |
| Google | Gemini 3.1 Flash Lite Preview | $0.2500 | $1.50 | 1.0M | 62.2 Strong |
| Bytedance-seed | Seed 1.6 | $0.2500 | $2.00 | 262.1K | N/A |
| OpenAI | GPT 5.1 Codex Mini | $0.2500 | $2.00 | 400K | 46.1 Good |
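
Because prices are quoted per 1M tokens, the cost of a single request depends on how many input and output tokens it uses. The sketch below shows one way to turn the listed rates into a per-request estimate. The function name `request_cost` and the `PRICES` dictionary are hypothetical helpers, not part of any provider's API; the rates are copied from three rows of the table. It assumes every input, including images or audio, is billed as tokens at the listed input rate, which may not match how a given provider meters non-text inputs.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Estimate the USD cost of one request from per-1M-token list prices."""
    return (input_tokens * input_price_per_1m
            + output_tokens * output_price_per_1m) / 1_000_000


# (input $/1M, output $/1M), copied from the table; illustrative subset only.
PRICES = {
    "Gemma 3 27b It": (0.08, 0.16),
    "GPT 4o Mini": (0.15, 0.60),
    "Qwen3.5 27b": (0.195, 1.56),
}


def compare(input_tokens: int, output_tokens: int) -> dict:
    """Return the estimated cost per model for the same request size."""
    return {
        name: request_cost(input_tokens, output_tokens, in_price, out_price)
        for name, (in_price, out_price) in PRICES.items()
    }


if __name__ == "__main__":
    # Example: a 10,000-token prompt that produces a 2,000-token answer.
    for name, cost in sorted(compare(10_000, 2_000).items(), key=lambda kv: kv[1]):
        print(f"{name}: ${cost:.6f}")
```

Output-heavy workloads can reorder the ranking: a model with a low input rate but a high output rate (such as the thinking variants in the table) may cost more per request than one with a higher input rate.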

