Hunt Instead of Wait: Evaluating Deep Data Research on Large Language Models Paper • 2602.02039 • Published 2 days ago • 4
Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs Paper • 2601.11061 • Published 19 days ago • 7
An Empirical Study on Preference Tuning Generalization and Diversity Under Domain Shift Paper • 2601.05882 • Published 26 days ago • 20