jian

lipliu

·

cquliujian

AI & ML interests

None yet

Recent Activity

Organizations

None yet

upvoted a paper 2 days ago

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

Paper • 2606.02437 • Published 10 days ago • 224

upvoted a paper 5 days ago

Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models

Paper • 2605.21573 • Published 22 days ago • 110

upvoted 11 papers 6 days ago

Code as Agent Harness

Paper • 2605.18747 • Published 24 days ago • 215

MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation

Paper • 2605.27366 • Published 16 days ago • 27

Rethinking Memory as Continuously Evolving Connectivity

Paper • 2605.28773 • Published 15 days ago • 34

Agent Explorative Policy Optimization for Multimodal Agentic Reasoning

Paper • 2605.28774 • Published 15 days ago • 90

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Paper • 2605.23904 • Published 20 days ago • 224

ESPO: Early-Stopping Proximal Policy Optimization

Paper • 2605.29860 • Published 14 days ago • 19

Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories

Paper • 2606.03979 • Published 9 days ago • 29

SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories

Paper • 2606.01311 • Published 11 days ago • 35

Trust Region On-Policy Distillation

Paper • 2606.01249 • Published 11 days ago • 42

Trust-Region Behavior Blending for On-Policy Distillation

Paper • 2605.31159 • Published 13 days ago • 65

Self-Distilled Policy Gradient

Paper • 2606.04036 • Published 9 days ago • 24

upvoted a paper 2 months ago

TAPS: Task Aware Proposal Distributions for Speculative Sampling

Paper • 2603.27027 • Published Mar 27 • 144

upvoted 4 papers 3 months ago

Reasoning Models Struggle to Control their Chains of Thought

Paper • 2603.05706 • Published Mar 5 • 39

Weak-Driven Learning: How Weak Agents make Strong Agents Stronger

Paper • 2602.08222 • Published Feb 9 • 290

iGRPO: Self-Feedback-Driven LLM Reasoning

Paper • 2602.09000 • Published Feb 9 • 19

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published Feb 9 • 266

upvoted a paper 4 months ago

On Data Engineering for Scaling LLM Terminal Capabilities

Paper • 2602.21193 • Published Feb 24 • 103

upvoted a paper 5 months ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 232