Imagination Helps Visual Reasoning, But Not Yet in Latent Space Paper • 2602.22766 • Published 3 days ago • 36
The Trinity of Consistency as a Defining Principle for General World Models Paper • 2602.23152 • Published 3 days ago • 187
HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation Paper • 2602.18283 • Published 9 days ago • 53
UniVBench: Towards Unified Evaluation for Video Foundation Models Paper • 2602.21835 • Published 4 days ago • 2
Solaris: Building a Multiplayer Video World Model in Minecraft Paper • 2602.22208 • Published 4 days ago • 22
JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation Paper • 2602.19163 • Published 7 days ago • 13
ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning Paper • 2602.21534 • Published 4 days ago • 22
SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model Paper • 2602.21818 • Published 4 days ago • 46
MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models Paper • 2602.17602 • Published 10 days ago • 52
The Art of Efficient Reasoning: Data, Reward, and Optimization Paper • 2602.20945 • Published 5 days ago • 4
Implicit Intelligence -- Evaluating Agents on What Users Don't Say Paper • 2602.20424 • Published 5 days ago • 3
PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency Paper • 2602.16745 • Published 11 days ago • 7
See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis Paper • 2602.20951 • Published 5 days ago • 13
Test-Time Training with KV Binding Is Secretly Linear Attention Paper • 2602.21204 • Published 5 days ago • 28
Query-focused and Memory-aware Reranker for Long Context Processing Paper • 2602.12192 • Published 17 days ago • 51
On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published 5 days ago • 88