GuidedSampling: Steering LLMs Towards Diverse Candidate Solutions at Inference-Time Paper • 2510.03777 • Published Feb 14 • 2
Mid-Training with Self-Generated Data Improves Reinforcement Learning in Language Models Paper • 2605.08472 • Published May 8 • 5
Vocabulary Dropout for Curriculum Diversity in LLM Co-Evolution Paper • 2604.03472 • Published Apr 28 • 2