|
|
--- |
|
|
license: cc-by-nc-4.0 |
|
|
datasets: |
|
|
- openai/gsm8k |
|
|
language: |
|
|
- en |
|
|
base_model: |
|
|
- Qwen/Qwen2.5-Math-1.5B |
|
|
pipeline_tag: text-generation |
|
|
library_name: transformers |
|
|
tags: |
|
|
- math |
|
|
- qwen |
|
|
- lora |
|
|
- mathematics |
|
|
- gsm8k |
|
|
--- |
|
|
|
|
|
# OpenMath |
|
|
Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning |
|
|
|
|
|
## Overview |
|
|
OpenMath is an open-source project focused on fine-tuning a small language model for mathematical reasoning using parameter-efficient LoRA training. |
|
|
|
|
|
This repository contains **only a LoRA adapter** trained on the full GSM8K dataset. Users must load the base model separately and attach the adapter using PEFT. |
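A minimal loading sketch (the adapter id below is a placeholder; substitute this repository's Hub id):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-Math-1.5B"
adapter_id = "<this-repo-id>"  # placeholder: the Hub id of this adapter repository

# Load the base model weights, then attach the LoRA adapter on top.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
```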
|
|
|
|
|
The latest version of this model was trained on an **AMD MI300X GPU using ROCm**, demonstrating that high-performance non-NVIDIA accelerators can successfully support modern large language model fine-tuning with PyTorch and Hugging Face. |
|
|
|
|
|
--- |
|
|
|
|
|
## Base Model |
|
|
**Qwen/Qwen2.5-Math-1.5B** |
|
|
|
|
|
This repository **does not contain the base model weights** — they must be loaded directly from Hugging Face before applying this LoRA adapter. |
|
|
|
|
|
--- |
|
|
|
|
|
## Hardware Used (Latest Training Run) |
|
|
|
|
|
- **GPU:** AMD MI300X (ROCm 7.0) |
|
|
- **VRAM:** 192 GB |
|
|
- **OS:** Ubuntu 24.04 |
|
|
- **Framework:** PyTorch + Hugging Face |
|
|
- **Backend:** ROCm |
|
|
|
|
|
--- |
|
|
|
|
|
## Dataset |
|
|
|
|
|
**GSM8K (Grade School Math 8K)** |
|
|
- **Training samples:** 7,473 (full training split) |
|
|
- **Evaluation:** Full GSM8K test split (1,319 problems) |
|
|
|
|
|
Loss was computed only on the solution portion of each example, with the prompt tokens masked out of the loss, to encourage stronger step-by-step reasoning behavior.
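As a sketch of how that masking might be implemented (the prompt template here is illustrative, not necessarily the exact one used in training):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")
train = load_dataset("openai/gsm8k", "main", split="train")  # 7,473 examples

def build_example(ex, max_len=1024):
    # Illustrative prompt format; only the solution tokens receive real labels.
    prompt = f"Problem: {ex['question']}\nSolution: "
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    answer_ids = tokenizer(ex["answer"] + tokenizer.eos_token,
                           add_special_tokens=False)["input_ids"]
    input_ids = (prompt_ids + answer_ids)[:max_len]
    labels = ([-100] * len(prompt_ids) + answer_ids)[:max_len]  # -100 = ignored by the loss
    return {"input_ids": input_ids, "labels": labels}

train = train.map(build_example, remove_columns=train.column_names)
```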
|
|
|
|
|
--- |
|
|
|
|
|
## Training Configuration |
|
|
|
|
|
**Method:** LoRA (parameter-efficient fine-tuning; a configuration sketch follows the settings below)
|
|
**Precision:** bfloat16 (no 4-bit quantization in this run) |
|
|
|
|
|
### LoRA settings |
|
|
- Rank: 16 |
|
|
- Alpha: 32 |
|
|
- Dropout: 0.05 |
|
|
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj` |
|
|
|
|
|
### Data & sequence |
|
|
- Max sequence length: 1024 |
|
|
|
|
|
### Optimization |
|
|
- Per-device batch size: 2 |
|
|
- Gradient accumulation: 8 |
|
|
- Effective batch size: 16 |
|
|
- Learning rate: 1e-4 |
|
|
- Optimizer: `adamw_torch` |
|
|
- Scheduler: cosine |
|
|
- Warmup: 5% |
|
|
|
|
|
### Training |
|
|
- **Epochs:** 3 |
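The hyperparameters above map onto `peft` and `transformers` configuration roughly as follows. This is a sketch, not the original training script; only values stated on this card are reflected here.

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="openmath-lora",         # illustrative output path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,      # effective batch size 16
    learning_rate=1e-4,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=3,
    bf16=True,                          # bfloat16, no 4-bit quantization
)
```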
|
|
|
|
|
--- |
|
|
|
|
|
## Results |
|
|
|
|
|
**GSM8K Accuracy (Full Test Set):** |
|
|
750 / 1,319 problems solved correctly = **56.86% accuracy**
|
|
|
|
|
This represents a substantial improvement over earlier small-scale Colab experiments and is a strong result for a 1.5B-parameter model trained with LoRA on the full GSM8K training split.
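The exact evaluation harness is not included in this repository. One common way to score GSM8K, sketched below, is to compare the final number in the model's generation against the reference answer, which in GSM8K ends with a `#### <number>` marker:

```python
import re

def extract_final_number(text: str):
    # Take the text after the last "####" marker if present, then the last number in it.
    if "####" in text:
        text = text.split("####")[-1]
    nums = re.findall(r"-?\d[\d,]*\.?\d*", text)
    return nums[-1].replace(",", "") if nums else None

def is_correct(prediction: str, reference: str) -> bool:
    return extract_final_number(prediction) == extract_final_number(reference)
```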
|
|
|
|
|
--- |
|
|
|
|
|
## How to Use This Model |
|
|
|
|
|
1. Load the base model **Qwen/Qwen2.5-Math-1.5B** from Hugging Face. |
|
|
2. Attach this LoRA adapter using PEFT. |
|
|
3. Use a structured prompt that includes an instruction, problem, and solution section for best results (see the example below).
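A generation sketch, assuming the base model and adapter are loaded as in the Overview snippet; the prompt format is illustrative, since the exact training template is not published here:

```python
prompt = (
    "Below is a math problem. Solve it step by step.\n\n"
    "Problem: Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?\n\n"
    "Solution: "
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Decode only the newly generated tokens (strip the prompt).
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```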
|
|
|
|
|
--- |
|
|
|
|
|
## Why This Matters |
|
|
|
|
|
- Demonstrates that **AMD MI300X** can effectively train modern LLMs with Hugging Face + LoRA. |
|
|
- Shows strong math reasoning at **1.5B parameters** with lightweight fine-tuning. |
|
|
- Provides a compact LoRA adapter rather than a full set of fine-tuned model weights.
|
|
|
|
|
--- |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- The model can make reasoning mistakes. |
|
|
- It should not be used for exams, assignments, or professional decisions. |
|
|
- Performance depends heavily on prompt formatting. |
|
|
|
|
|
--- |
|
|
|
|
|
## License |
|
|
|
|
|
**cc-by-nc-4.0** |
|
|
|