# Ministral-3-14B-writer

LoRA fine-tune of Ministral 3 14B for fiction writing.
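
A minimal loading sketch with transformers and peft. The base model identifier below is an assumption; check the model page for the exact base checkpoint this adapter was trained against.

```python
# Minimal sketch: load the base model and attach this LoRA adapter with peft.
# BASE_MODEL is an assumed identifier -- verify the actual base checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "mistralai/Ministral-3-14B"  # assumed id
ADAPTER = "thestarfarer/Ministral-3-14B-writer"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)

prompt = "The lighthouse keeper had not spoken to anyone in three years."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```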

## Training Data

| Dataset   | Samples | Words | Vocabulary |
|-----------|---------|-------|------------|
| Primary   | 68,119  | 808M  | 738K       |
| Secondary | 12,635  | 148M  | 194K       |
| Total     | 80,754  | 956M  | —          |
- ~15k tokens per sample on average
- Dialogue-heavy prose (~87-89%)
- Mean sentence length: 14-16 words (see the sketch after this list)
- Mixed first- and third-person POV
- No sample packing
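
As a rough illustration only (the numbers above come from dataset-analyzer, whose method may differ), per-sample statistics like these can be estimated with a single pass over the text:

```python
# Illustrative per-sample statistics; not the actual dataset-analyzer pipeline.
import re

def sample_stats(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+\s+", text) if s.strip()]
    words = text.split()
    # Treat paragraphs containing quoted speech as dialogue paragraphs.
    paragraphs = [p for p in text.split("\n") if p.strip()]
    dialogue = [p for p in paragraphs if '"' in p or "\u201c" in p]
    return {
        "words": len(words),
        "mean_sentence_len": len(words) / max(len(sentences), 1),
        "dialogue_share": len(dialogue) / max(len(paragraphs), 1),
    }

print(sample_stats('"Hello," she said. He did not answer. "Please."'))
```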

## Training

- 4×H100 80GB
- LoRA rank 512, alpha 512 (rsLoRA; see the config sketch after this list)
- 16k context
- BF16 base weights, FP32 Adam optimizer states
- ~34 hours
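
A peft configuration matching these hyperparameters might look like the sketch below; `target_modules` is an assumption, since the card does not list which projections were adapted.

```python
# Sketch of an equivalent peft LoraConfig; target_modules is an assumption.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=512,
    lora_alpha=512,
    use_rslora=True,  # rsLoRA: scale by alpha / sqrt(r) instead of alpha / r
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Ministral-3-14B",  # assumed base id
    torch_dtype=torch.bfloat16,   # BF16 base weights, as in the card
)
model = get_peft_model(base, config)
model.print_trainable_parameters()
```

With rank = alpha = 512, rsLoRA's alpha/√r scaling gives an effective scale of √512 ≈ 22.6, where standard LoRA's alpha/r scaling would give 1.0.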
| Metric        | Value                   |
|---------------|-------------------------|
| Steps         | 2100 / 2500             |
| Learning rate | 1e-5 → 1e-6 (cosine)    |
| Train loss    | 2.42 → 2.16             |
| Eval loss     | 2.26 → 2.02             |
| Grad norm     | 3.2 avg (clipped at 10) |

Aborted at step 2100: val loss plateaued due to overly aggressive LR decay.
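
To make the plateau concrete, here is a standard cosine schedule with the stated endpoints; that the 1e-6 floor is reached exactly at step 2500 is an assumption about the configuration.

```python
# Cosine schedule sketch: lr(t) = lr_min + 0.5*(lr_max - lr_min)*(1 + cos(pi*t/T)).
# Assumes decay from 1e-5 to a 1e-6 floor over the full 2500 steps.
import math

def cosine_lr(step: int, total: int = 2500,
              lr_max: float = 1e-5, lr_min: float = 1e-6) -> float:
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / total))

for step in (0, 1250, 2100, 2500):
    print(f"step {step:4d}: lr = {cosine_lr(step):.2e}")
# Step 2100 gives ~1.6e-6, already near the floor, consistent with the plateau.
```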

See main_dataset_analysis.md and secondary_dataset_analysis.md for detailed statistics.

Trained with ministral3-fsdp-lora-loop. Dataset analysis via dataset-analyzer.
