HealthBenchAdvancedDemo
Evaluate healthcare models using prompts and datasets
Evaluate healthcare model responses with scoring
Chat with a helpful assistant
Evaluate language models with T-Test and plot results