Post-training models for the medical domain: SFT, reward model, GRPO
ZuochengYing
whaleL
AI & ML interests
None yet
Recent Activity
updated a model 2 days ago: whaleL/rlhf
updated a model 2 days ago: whaleL/checkpoint
published a model 2 days ago: whaleL/checkpoint
Organizations
None yet