QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models Paper • 2602.20309 • Published 11 days ago • 16 • 4
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video Paper • 2603.04291 • Published 2 days ago • 11
view article Article What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware Aug 8, 2025 • 33
QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models Paper • 2602.20309 • Published 11 days ago • 16
QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models Paper • 2602.20309 • Published 11 days ago • 16
Improving Data and Reward Design for Scientific Reasoning in Large Language Models Paper • 2602.08321 • Published 26 days ago • 42
TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs Paper • 2512.14698 • Published Dec 16, 2025 • 21
LRQ-DiT: Log-Rotation Post-Training Quantization of Diffusion Transformers for Image and Video Generation Paper • 2508.03485 • Published Aug 5, 2025 • 2
LRQ-DiT: Log-Rotation Post-Training Quantization of Diffusion Transformers for Image and Video Generation Paper • 2508.03485 • Published Aug 5, 2025 • 2
From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model Paper • 2510.19871 • Published Oct 22, 2025 • 30
Video-As-Prompt Collection The model zoo for "Video-As-Prompt: Unified Semantic Control for Video Generation" • 3 items • Updated Oct 27, 2025 • 13
Video-As-Prompt: Unified Semantic Control for Video Generation Paper • 2510.20888 • Published Oct 23, 2025 • 50
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning Paper • 2509.08519 • Published Sep 10, 2025 • 128
AudioStory: Generating Long-Form Narrative Audio with Large Language Models Paper • 2508.20088 • Published Aug 27, 2025 • 21