Hugging Face
Models: 7,161
Active filters: image-text-to-text
moonshotai/Kimi-K2.5 • Image-Text-to-Text • 171B • Updated 1 day ago • 274k • 1.75k
PaddlePaddle/PaddleOCR-VL-1.5 • Image-Text-to-Text • 1.0B • Updated 7 days ago • 7.2k • 349
deepseek-ai/DeepSeek-OCR-2 • Image-Text-to-Text • 3B • Updated 3 days ago • 458k • 696
internlm/Intern-S1-Pro • Image-Text-to-Text • Updated 1 day ago • 3.57k • 157
lightonai/LightOnOCR-2-1B • Image-Text-to-Text • 1B • Updated 4 days ago • 150k • 494
tencent/Youtu-VL-4B-Instruct • Image-Text-to-Text • 5B • Updated about 17 hours ago • 3.43k • 139
Qwen/Qwen3-VL-8B-Instruct • Image-Text-to-Text • 9B • Updated Oct 15, 2025 • 2.49M • 729
trillionlabs/gWorld-8B • Image-Text-to-Text • 9B • Updated 2 days ago • 106 • 28
google/medgemma-1.5-4b-it • Image-Text-to-Text • Updated 13 days ago • 144k • 413
google/translategemma-4b-it • Image-Text-to-Text • Updated 9 days ago • 99.3k • 595
google/gemma-3-4b-it • Image-Text-to-Text • Updated Mar 21, 2025 • 996k • 1.16k
Hcompany/Holo2-235B-A22B • Image-Text-to-Text • 236B • Updated 3 days ago • 108 • 20
tencent/Youtu-VL-4B-Instruct-GGUF • Image-Text-to-Text • 5B • Updated 1 day ago • 3.44k • 55
trillionlabs/gWorld-32B • Image-Text-to-Text • 33B • Updated 2 days ago • 183 • 18
google/gemma-3-27b-it • Image-Text-to-Text • Updated Mar 21, 2025 • 1.62M • 1.85k
stepfun-ai/Step3-VL-10B • Image-Text-to-Text • 10B • Updated 2 days ago • 78k • 382
nvidia/Cosmos-Reason2-8B • Image-Text-to-Text • 9B • Updated 7 days ago • 127k • 104
deepseek-ai/DeepSeek-OCR • Image-Text-to-Text • 3B • Updated Nov 4, 2025 • 3.02M • 3.13k
google/medgemma-4b-it • Image-Text-to-Text • Updated Oct 28, 2025 • 370k • 877
PaddlePaddle/PaddleOCR-VL • Image-Text-to-Text • 1.0B • Updated 1 day ago • 16.1k • 1.54k
google/translategemma-27b-it • Image-Text-to-Text • Updated 9 days ago • 36.9k • 294
Qwen/Qwen2.5-VL-7B-Instruct • Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 3.39M • 1.45k
ByteDance-Seed/UI-TARS-1.5-7B • Image-Text-to-Text • 8B • Updated Apr 18, 2025 • 67k • 507
zai-org/GLM-4.6V-Flash • Image-Text-to-Text • 10B • Updated Dec 9, 2025 • 29.5k • 573
google/translategemma-12b-it • Image-Text-to-Text • Updated 9 days ago • 154k • 244
Qwen/Qwen2.5-VL-3B-Instruct • Image-Text-to-Text • 4B • Updated Apr 6, 2025 • 21.6M • 602
ibm-granite/granite-docling-258M • Image-Text-to-Text • 0.3B • Updated Sep 23, 2025 • 219k • 1.11k
Qwen/Qwen3-VL-4B-Instruct • Image-Text-to-Text • 4B • Updated Oct 15, 2025 • 749k • 325
tencent/Youtu-Parsing • Image-Text-to-Text • 3B • Updated 8 days ago • 115 • 36
moonshotai/Kimi-VL-A3B-Thinking-2506 • Image-Text-to-Text • 16B • Updated 7 days ago • 125k • 346