google-research-datasets/conceptual_captions • Updated Jun 17, 2024 • 5.34M • 14.3k • 103
ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning Paper • 2306.00103 • Published May 31, 2023 • 1