MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data Paper • 2603.09206 • Published 5 days ago • 42
LiveMedBench: A Contamination-Free Medical Benchmark for LLMs with Automated Rubric Evaluation Paper • 2602.10367 • Published Feb 10 • 13
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs Paper • 2507.11097 • Published Jul 15, 2025 • 64