UME-R1 Collection • UME-R1 is a framework designed to endow multimodal embedding models with the flexibility to switch between discriminative and generative embeddings • 4 items • Updated 17 days ago • 8
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion Paper • 2509.01215 • Published Sep 1 • 50
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning Paper • 2503.04812 • Published Mar 4 • 15
LLaVE Collection • LLaVE is a series of large language and vision embedding models trained on a variety of multimodal embedding datasets • 4 items • Updated Mar 10 • 8