RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards • Paper • 2509.21319 • Published Sep 25 • 6
Reward Models 10-2025 • Collection • A collection of great reward models for research and production • 7 items • Updated 3 days ago • 12
NeMo Gym • Collection • Collection of RL verifiable data for NeMo Gym • 13 items • Updated 3 days ago • 31
Gemma’s Soul-Vault: Evolutionary JumpReLU Steering Hub • Collection • A curated collection of JumpReLU Sparse Autoencoders (SAEs) for Gemma 3, evolved via DSPy 3 & GEPA for neural steering • 10 items • Updated 5 days ago • 1