Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +6 about 10 hours ago • 3
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent +2 Apr 22, 2024 • 81
Preference Tuning LLMs with Direct Preference Optimization Methods +3 Jan 18, 2024 • 79