minghao

Liam-Liu

AI & ML interests

LLM, AD

Recent Activity

upvoted a paper about 1 month ago

Reasoning with Sampling: Your Base Model is Smarter Than You Think

upvoted a paper about 1 month ago

ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems

authored a paper about 2 months ago

OAgents: An Empirical Study of Building Effective Agents

upvoted 2 papers about 1 month ago

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published Oct 16 • 47

ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems

Paper • 2510.11652 • Published Oct 13 • 28

upvoted 2 papers about 2 months ago

COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes

Paper • 2510.14763 • Published Oct 16 • 13

SimKO: Simple Pass@K Policy Optimization

Paper • 2510.14807 • Published Oct 16 • 10

upvoted 2 papers 2 months ago

EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Paper • 2509.26346 • Published Sep 30 • 18

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 140

upvoted 8 papers 3 months ago

Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions?

Paper • 2509.04292 • Published Sep 4 • 57

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2 • 124

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225

A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

Paper • 2508.18106 • Published Aug 25 • 345

upvoted 4 papers 4 months ago

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6 • 129

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8 • 192

Efficient Agents: Building Effective Agents While Reducing Cost

Paper • 2508.02694 • Published Jul 24 • 85

VeriGUI: Verifiable Long-Chain GUI Dataset

Paper • 2508.04026 • Published Aug 6 • 160

upvoted 2 papers 5 months ago

A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8 • 93

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

Paper • 2507.06181 • Published Jul 8 • 43
