Yaorui SHI

yrshi

syr-cn

AI & ML interests

None yet

Recent Activity

upvoted 2 papers about 6 hours ago

On the Geometry of On-Policy Distillation

Paper • 2606.07082 • Published 6 days ago • 61

Agents' Last Exam

Paper • 2606.05405 • Published 8 days ago • 155

upvoted a paper 3 days ago

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Paper • 2606.04703 • Published 8 days ago • 21

upvoted a paper 13 days ago

V_{0.5}: Generalist Value Model as a Prior for Sparse RL Rollouts

Paper • 2603.10848 • Published Mar 11 • 16

upvoted a paper 14 days ago

VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions

Paper • 2605.27141 • Published 16 days ago • 19

upvoted a paper 15 days ago

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents

Paper • 2605.25624 • Published 17 days ago • 33

upvoted a paper 16 days ago

Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

Paper • 2506.03610 • Published Jun 4, 2025 • 10

upvoted a paper 17 days ago

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published 22 days ago • 204

upvoted a paper 19 days ago

SOD: Step-wise On-policy Distillation for Small Language Model Agents

Paper • 2605.07725 • Published May 8 • 25

upvoted 2 papers 21 days ago

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

Paper • 2605.19577 • Published 23 days ago • 58

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published Apr 9 • 293

upvoted 2 collections 21 days ago

Agent

Collection

122 items • Updated 1 day ago • 13

Papers

Collection

1 item • Updated May 9 • 1

upvoted a paper 26 days ago

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published 28 days ago • 111

upvoted a paper 27 days ago

Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context

Paper • 2605.13831 • Published 29 days ago • 87

authored 3 papers 28 days ago

upvoted 2 papers 29 days ago

Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria

Paper • 2605.08354 • Published May 8 • 23

Rubric-based On-policy Distillation

Paper • 2605.07396 • Published May 8 • 41
