lmms-lab/LLaVA-OneVision-1.5-4B-Instruct
Image-Text-to-Text • 5B • Updated • 2.72k • 16
Feeling and building the multimodal intelligence.
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe