Add evaluation results for GPQA-Diamond, MMLU-Pro
#6 opened by SaylorTwift (HF Staff)
Evaluation Results
This PR adds evaluation results extracted from the Model Card.
**Benchmarks:**
- MMLU-Pro: 81.02
- GPQA-Diamond: 74.43
**Files created:**
- .eval_results/mmlu_pro.yaml
- .eval_results/gpqa_diamond.yaml

---

Extracted automatically using the [LLM-powered evaluation extractor](https://github.com/huggingface/community-evals).