Submitted by
yiyexy
AI & ML interests
Feeling and building multimodal intelligence.
Papers
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling