UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist Paper • 2511.08521 • Published Nov 11 • 37
Dr.V: A Hierarchical Perception-Temporal-Cognition Framework to Diagnose Video Hallucination by Fine-grained Spatial-Temporal Grounding Paper • 2509.11866 • Published Sep 15 • 1
VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models Paper • 2508.12081 • Published Aug 16
Leaderboards and Benchmarks for MLLM Collection Leaderboards and Benchmarks for MLLM • 4 items • Updated May 9 • 1