SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published 17 days ago • 43
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 95
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 18 days ago • 53
Geometry-Aware Rotary Position Embedding for Consistent Video World Model Paper • 2602.07854 • Published 22 days ago • 10
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization Paper • 2602.02958 • Published 28 days ago • 33
World Simulation with Video Foundation Models for Physical AI Paper • 2511.00062 • Published Oct 28, 2025 • 44