Scaling Language-Centric Omnimodal Representation Learning
Paper • 2510.11693 • Published • 102
LCO-Embedding/LCO-Embedding-Omni-7B
Feature Extraction • 9B • Updated • 153 • 4
LCO-Embedding/LCO-Embedding-Omni-3B
Feature Extraction • 5B • Updated • 2.37k • 2
LCO-Embedding/SeaDoc
Viewer • Updated • 12.2k • 199 • 2
LCO-Embedding
Organization Card
Welcome to the LCO-Embedding project: Scaling Language-Centric Omnimodal Representation Learning.
Highlights:
- We introduce LCO-Embedding, a language-centric omnimodal representation learning method, and the LCO-Embedding model family, setting a new state of the art on MIEB (Massive Image Embedding Benchmark) while also supporting audio and video.
- We introduce the Generation-Representation Scaling Law, connecting models' generative capabilities to their representation upper bounds.
- We introduce SeaDoc, a challenging visual document retrieval task in Southeast Asian languages, and show that continual generative pretraining before contrastive learning raises the representation upper bound.
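To make the "contrastive learning" step above concrete, here is a minimal numpy sketch of an InfoNCE-style objective of the kind typically used to align paired embeddings (e.g. text with image, audio, or video representations). This is illustrative only; the exact loss, temperature, and pooling used by LCO-Embedding are assumptions, not taken from the paper.

```python
import numpy as np

def l2_normalize(x):
    """Normalize each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def info_nce_loss(anchor_emb, paired_emb, temperature=0.05):
    """InfoNCE over a batch of paired embeddings.

    anchor_emb, paired_emb: (batch, dim) L2-normalized arrays where row i of
    each is a positive pair and every other row acts as an in-batch negative.
    The temperature value here is a common default, not the paper's setting.
    """
    # Cosine-similarity matrix scaled by temperature
    logits = anchor_emb @ paired_emb.T / temperature
    # Numerically stable log-softmax over each row
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal; minimize their negative log-likelihood
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
anchors = l2_normalize(rng.normal(size=(8, 32)))

# Perfectly aligned pairs (each anchor paired with itself) score a low loss
aligned = info_nce_loss(anchors, anchors)
# Random, unrelated pairs score a much higher loss
random_pairs = info_nce_loss(anchors, l2_normalize(rng.normal(size=(8, 32))))
print(aligned < random_pairs)
```

Training pulls each anchor toward its paired embedding and pushes it away from the other items in the batch, which is what drives modality-specific encoders toward a shared, language-centric space.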