Hyperparameters
#2, opened by princeton-nlp
Hi, can you provide the hyperparameters (lr, batch size, sequence length, etc.) used for this run? Thank you!
Hi, you can find the training script here: https://github.com/huggingface/smollm/blob/main/text/pretraining/continual-pretraining/finemath/160B-runs/fwedu-finemath-infiwebmath-4plus.yaml
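For anyone who wants to pull those values out of the config programmatically, here is a minimal sketch. It assumes the YAML has already been downloaded and parsed into a nested dict (for example with PyYAML's `yaml.safe_load`). The key names, the `NEEDLES` substrings, and the example values below are hypothetical placeholders, not the actual contents of the linked file.

```python
from typing import Any, Dict, Iterator, Tuple

# Substrings to look for in key names. These are guesses; adjust them
# after looking at the real config file.
NEEDLES = ("lr", "learning_rate", "batch", "sequence_length")


def walk(node: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, value) for every leaf in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield from walk(value, path)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from walk(value, f"{prefix}[{index}]")
    else:
        yield prefix, node


def find_hyperparameters(config: Any, needles=NEEDLES) -> Dict[str, Any]:
    """Return the leaves whose final key name contains one of the needles."""
    return {
        path: value
        for path, value in walk(config)
        if any(needle in path.split(".")[-1].lower() for needle in needles)
    }


# Hypothetical stand-in for the parsed YAML. The structure and values are
# made up for illustration and are NOT the values used for this run.
example = {
    "optimizer": {"lr": 0.1, "weight_decay": 0.5},
    "tokens": {"sequence_length": 8, "micro_batch_size": 2},
}

for path, value in find_hyperparameters(example).items():
    print(path, "=", value)
```

Matching on substrings of the key name is deliberately loose, so check the printed paths against the file itself before relying on them.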