MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published 3 days ago • 109
TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas Paper • 2603.16448 • Published 3 days ago • 51
InCoder-32B: Code Foundation Model for Industrial Scenarios Paper • 2603.16790 • Published 3 days ago • 288
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published 4 days ago • 165
SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering? Paper • 2603.15401 • Published 4 days ago • 15
In-Context Reinforcement Learning for Tool Use in Large Language Models Paper • 2603.08068 • Published 12 days ago • 39
Can Large Language Models Keep Up? Benchmarking Online Adaptation to Continual Knowledge Streams Paper • 2603.07392 • Published 13 days ago • 17
BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning Paper • 2603.04918 • Published 16 days ago • 55
BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing? Paper • 2603.03194 • Published 17 days ago • 56
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale Paper • 2602.23866 • Published 22 days ago • 88
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training Paper • 2602.10693 • Published Feb 11 • 220
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts Paper • 2602.13367 • Published Feb 13 • 34
SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training Paper • 2602.03411 • Published Feb 3 • 38
SWE-World: Building Software Engineering Agents in Docker-Free Environments Paper • 2602.03419 • Published Feb 3 • 40
RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents Paper • 2602.02486 • Published Feb 2 • 19