train_codealpacapy_456_1765235666

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 8.1253
  • Num Input Tokens Seen: 24973864
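
The framework versions listed below include PEFT, so this checkpoint appears to be a PEFT adapter on top of the base model rather than full fine-tuned weights. A minimal loading sketch under that assumption (the adapter repo id is taken from this card's name, and the prompt and generation settings are illustrative placeholders):

```python
# Hedged sketch: load the PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct.
# Assumes the adapter is hosted at rbelanec/train_codealpacapy_456_1765235666.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_456_1765235666"  # assumed adapter repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Example prompt (illustrative only; not part of the card).
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```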

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (mirrored in the TrainingArguments sketch after this list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
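
For reference, a hedged sketch of TrainingArguments that mirror the settings above. The actual training script is not included in this card, so anything beyond the listed values (e.g. output_dir) is a placeholder:

```python
# Sketch only: maps the hyperparameters listed above onto transformers.TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_codealpacapy_456_1765235666",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```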

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.5233        | 1.0   | 1908  | 0.4729          | 1246832           |
| 0.5393        | 2.0   | 3816  | 0.4684          | 2497936           |
| 0.4665        | 3.0   | 5724  | 0.4608          | 3743760           |
| 0.4402        | 4.0   | 7632  | 0.4560          | 4991472           |
| 1.0149        | 5.0   | 9540  | 0.4551          | 6239608           |
| 0.4209        | 6.0   | 11448 | 0.4515          | 7485248           |
| 0.6452        | 7.0   | 13356 | 0.4506          | 8733024           |
| 0.6306        | 8.0   | 15264 | 0.4495          | 9983720           |
| 0.367         | 9.0   | 17172 | 0.4497          | 11229792          |
| 0.4111        | 10.0  | 19080 | 0.4492          | 12476552          |
| 0.4192        | 11.0  | 20988 | 0.4500          | 13725560          |
| 0.453         | 12.0  | 22896 | 0.4507          | 14977976          |
| 0.4087        | 13.0  | 24804 | 0.4503          | 16225896          |
| 0.5087        | 14.0  | 26712 | 0.4510          | 17477224          |
| 0.4207        | 15.0  | 28620 | 0.4529          | 18726216          |
| 0.3203        | 16.0  | 30528 | 0.4537          | 19973408          |
| 0.3979        | 17.0  | 32436 | 0.4543          | 21226656          |
| 0.4305        | 18.0  | 34344 | 0.4551          | 22472696          |
| 0.4425        | 19.0  | 36252 | 0.4555          | 23722376          |
| 0.7746        | 20.0  | 38160 | 0.4553          | 24973864          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1