Usefulness Judge
Finetuned judges to evaluate how useful a response is to a prompt
miulab/Qwen3-1.7B-Usefulness • Text Generation • 2B • Updated Dec 15, 2025 • 37 • 1
miulab/Qwen3-4B-Usefulness • Text Generation • 4B • Updated Dec 15, 2025 • 36 • 1
miulab/Qwen3-8B-Usefulness • Text Generation • 8B • Updated 21 days ago • 29
miulab/usefulness-judge • Viewer • Updated 14 days ago • 31k • 366
DogeRM
Models trained/used in the paper "DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging" (https://arxiv.org/abs/2407.01470)
miulab/llama2-7b-oss-instruct • Text Generation • 7B • Updated Oct 3, 2024 • 6
miulab/llama2-7b-alpaca-sft-10k • Text Generation • 7B • Updated Oct 3, 2024 • 6
miulab/llama2-7b-magicoder-evol-instruct • Text Generation • 7B • Updated Oct 3, 2024 • 2
miulab/llama2-7b-ultrafeedback-rm • Text Classification • 7B • Updated Oct 3, 2024 • 1 • 1