Guided Self-Evolving LLMs with Minimal Human Supervision Paper • 2512.02472 • Published 25 days ago • 50
Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning Paper • 2512.15687 • Published 9 days ago • 17
Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values Paper • 2510.20187 • Published Oct 23 • 18
TARDIS STRIDE: A Spatio-Temporal Road Image Dataset for Exploration and Autonomy Paper • 2506.11302 • Published Jun 12 • 3
Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation Paper • 2509.15194 • Published Sep 18 • 33