Andrew Zhao
andrewzh
AI & ML interests
Reinforcement Learning, Agents
Recent Activity
upvoted a paper 43 minutes ago
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers
upvoted a paper about 1 month ago
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning
authored a paper about 2 months ago
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning