# Ministral-3-14B-writer
LoRA fine-tune of Ministral 3 14B for fiction writing.
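A minimal inference sketch using transformers with a peft adapter. Both repo ids below are placeholders (the card does not state the exact checkpoint paths); substitute the real ones:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "mistralai/Ministral-3-14B"       # placeholder: actual base repo id may differ
ADAPTER = "user/Ministral-3-14B-writer"  # placeholder: this adapter's repo id

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)  # attach the LoRA adapter

prompt = "The rain had not stopped for three days when"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.8)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```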
## Training Data
| Dataset | Samples | Words | Vocabulary |
|---|---|---|---|
| Primary | 68,119 | 808M | 738K |
| Secondary | 12,635 | 148M | 194K |
| Total | 80,754 | 956M | – |
- ~15k tokens per sample on average
- Dialogue-heavy prose (~87-89%; see the stats sketch after this list)
- Mean sentence length: 14-16 words
- Mixed first- and third-person POV
- No sample packing
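The statistics above come from `dataset-analyzer`; purely as an illustration, here is one way such numbers could be approximated. The tokenizer id, the quote-based dialogue heuristic, and the naive sentence splitter are my assumptions, not the actual tool's logic:

```python
import re
from statistics import mean
from transformers import AutoTokenizer

def corpus_stats(texts, tokenizer_id="mistralai/Ministral-3-14B"):  # placeholder id
    tok = AutoTokenizer.from_pretrained(tokenizer_id)
    token_counts, sent_lens, dialogue_flags = [], [], []
    for text in texts:
        token_counts.append(len(tok(text)["input_ids"]))
        # naive sentence split; real analyzers use proper segmenters
        sentences = [s for s in re.split(r"[.!?]+\s+", text) if s.strip()]
        sent_lens.extend(len(s.split()) for s in sentences)
        # crude heuristic: a sample "has dialogue" if it contains quoted spans
        dialogue_flags.append(bool(re.search(r'"[^"]+"', text)))
    return {
        "mean_tokens_per_sample": mean(token_counts),
        "mean_sentence_words": mean(sent_lens),
        "dialogue_sample_share": mean(dialogue_flags),
    }
```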
## Training
- 4×H100 80GB
- LoRA rank 512, alpha 512 (rsLoRA; see the config sketch after this list)
- 16k context
- BF16 base weights, FP32 Adam optimizer states
- ~34 hours
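These hyperparameters map directly onto peft's `LoraConfig`. A minimal sketch, assuming Mistral-style projection names as target modules and zero dropout (the card states neither):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=512,
    lora_alpha=512,
    use_rslora=True,   # rank-stabilized LoRA: scales updates by alpha / sqrt(r)
    # assumed target modules; typical for Mistral-family architectures
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,  # assumption; not stated on the card
    task_type="CAUSAL_LM",
)
```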
| Metric | Value |
|---|---|
| Steps | 2100 / 2500 |
| Learning rate | 1e-5 → 1e-6 (cosine) |
| Train loss | 2.42 → 2.16 |
| Eval loss | 2.26 → 2.02 |
| Grad norm | 3.2 avg (clipped at 10) |
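The learning-rate row describes a cosine decay to a 1e-6 floor rather than to zero. A self-contained sketch of that schedule (warmup omitted; the card does not mention any):

```python
import math

def cosine_lr(step, total_steps=2500, max_lr=1e-5, min_lr=1e-6):
    """Cosine decay from max_lr at step 0 to min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

print(cosine_lr(0))     # 1e-05
print(cosine_lr(2100))  # ~1.56e-06, already near the floor when training stopped
```

By step 2100 the LR is close to its floor, consistent with the plateau noted next.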
Training was aborted at step 2100: validation loss had plateaued, a consequence of the overly aggressive LR decay.
See `main_dataset_analysis.md` and `secondary_dataset_analysis.md` for detailed statistics.
Trained with `ministral3-fsdp-lora-loop`; dataset analysis via `dataset-analyzer`.