DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing Paper • 2602.12205 • Published 11 days ago • 78
Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition Paper • 2602.08439 • Published 15 days ago • 28
GenArena: How Can We Achieve Human-Aligned Evaluation for Visual Generation Tasks? Paper • 2602.06013 • Published 18 days ago
Unified Personalized Reward Model for Vision Generation Paper • 2602.02380 • Published 21 days ago • 20
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing Paper • 2602.02437 • Published 21 days ago • 76
SS4D: Native 4D Generative Model via Structured Spacetime Latents Paper • 2512.14284 • Published Dec 16, 2025 • 14
EtCon: Edit-then-Consolidate for Reliable Knowledge Editing Paper • 2512.04753 • Published Dec 4, 2025 • 8
ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning Paper • 2512.05111 • Published Dec 4, 2025 • 50
Think Visually, Reason Textually: Vision-Language Synergy in ARC Paper • 2511.15703 • Published Nov 19, 2025 • 9
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation Paper • 2112.02244 • Published Dec 4, 2021
UniREditBench: A Unified Reasoning-based Image Editing Benchmark Paper • 2511.01295 • Published Nov 3, 2025 • 39
Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning Paper • 2510.27606 • Published Oct 31, 2025 • 30
RLFR: Extending Reinforcement Learning for LLMs with Flow Environment Paper • 2510.10201 • Published Oct 11, 2025 • 36
LSVOS 2025 Challenge Report: Recent Advances in Complex Video Object Segmentation Paper • 2510.11063 • Published Oct 13, 2025 • 1