A collection of multiple benchmarks for large reasoning model evaluation
datasets-and-models
non-profit
Models (63)
guanning-ai/SmolLM-Checkpoints-Final-0124
guanning-ai/SmolLM-Checkpoints-Final-0123
guanning-ai/SmolLM-maclaurin-baseline-T16
guanning-ai/SmolLM-pkpo-T16
guanning-ai/SmolLM-grpo-32rollouts
guanning-ai/SmolLM-p-normalization-32rollouts
guanning-ai/Smollm004
guanning-ai/Smollm002
guanning-ai/Smollm003
guanning-ai/Smollm001
Datasets (138)
guanning-ai/gsm8k-platinum • 1.21k rows • 11 downloads
guanning-ai/math500_level5 • 134 rows • 26 downloads
guanning-ai/math500_level4 • 128 rows • 22 downloads
guanning-ai/math500_level3 • 105 rows • 23 downloads
guanning-ai/math500_level2 • 90 rows • 26 downloads
guanning-ai/math500_level1 • 43 rows • 23 downloads
guanning-ai/minervamath • 272 rows • 13 downloads
guanning-ai/smollm-gsm8k-data-1024 • 7.65M rows • 83 downloads
guanning-ai/gsm8k-metamath • 160k rows • 31 downloads
guanning-ai/gsm8k-mumath • 92k rows • 24 downloads