On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters Paper • 2606.02437 • Published 10 days ago • 224
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published 22 days ago • 110
MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation Paper • 2605.27366 • Published 16 days ago • 27
Rethinking Memory as Continuously Evolving Connectivity Paper • 2605.28773 • Published 15 days ago • 34
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 15 days ago • 90
SkillOpt: Executive Strategy for Self-Evolving Agent Skills Paper • 2605.23904 • Published 20 days ago • 224
Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories Paper • 2606.03979 • Published 9 days ago • 29
SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories Paper • 2606.01311 • Published 11 days ago • 35
Trust-Region Behavior Blending for On-Policy Distillation Paper • 2605.31159 • Published 13 days ago • 65
TAPS: Task Aware Proposal Distributions for Speculative Sampling Paper • 2603.27027 • Published Mar 27 • 144
Reasoning Models Struggle to Control their Chains of Thought Paper • 2603.05706 • Published Mar 5 • 39
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published Feb 9 • 290
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 266
On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published Feb 24 • 103
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 232