LTX-2: Efficient Joint Audio-Visual Foundation Model • Paper • 2601.03233 • Published 16 days ago • 132
Why LLMs Aren't Scientists Yet: Lessons from Four Autonomous Research Attempts • Paper • 2601.03315 • Published 17 days ago • 6
Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation • Paper • 2512.24271 • Published 23 days ago • 61
More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models • Paper • 2509.25848 • Published Sep 30, 2025 • 80
NaviTrace: Evaluating Embodied Navigation of Vision-Language Models • Paper • 2510.26909 • Published Oct 30, 2025 • 14