Whisper small-baseline kh - Sethisak San

This model (S-Sethisak/whisper-small-kh-baseline) is a fine-tuned version of openai/whisper-small on the KhmerAsrDataset dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2433
  • WER: 96.2162%
  • CER: 28.7772%
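WER and CER are edit-distance metrics reported as percentages. A WER far above the CER is expected here: the Khmer script does not separate words with spaces, so word-level alignment against the reference is brittle even when most characters are correct. The actual evaluation presumably used a metrics library, but as a minimal, self-contained sketch of how the two metrics are defined:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    m, n = len(ref), len(hyp)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution
        prev = cur
    return prev[n]

def wer(ref, hyp):
    """Word error rate in percent: edit distance over reference word count."""
    ref_words = ref.split()
    return 100.0 * edit_distance(ref_words, hyp.split()) / len(ref_words)

def cer(ref, hyp):
    """Character error rate in percent: edit distance over reference length."""
    return 100.0 * edit_distance(list(ref), list(hyp)) / len(ref)
```

For example, one substituted word out of three gives a WER of 33.33%.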

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 25
  • mixed_precision_training: Native AMP
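The derived quantities above follow directly from the listed values; as a quick sanity check (steps per epoch taken from the training-results table):

```python
# Sanity-check the derived training quantities from the hyperparameters above.
train_batch_size = 32
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps
# 64, matching the reported total_train_batch_size

steps_per_epoch = 78  # optimizer steps per epoch, from the training-results table
num_epochs = 25
total_steps = steps_per_epoch * num_epochs
# 1950, matching the final step in the training-results table
```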

Training results

| Training Loss | Epoch | Step | Validation Loss | WER (%) | CER (%) |
|--------------:|------:|-----:|----------------:|--------:|--------:|
| 1.5448 | 1.0 | 78 | 1.5003 | 100.0 | 137.4657 |
| 1.3444 | 2.0 | 156 | 1.3168 | 100.0 | 114.8952 |
| 1.2049 | 3.0 | 234 | 1.1834 | 100.0 | 100.3741 |
| 1.0807 | 4.0 | 312 | 1.0514 | 100.0 | 100.7715 |
| 0.8051 | 5.0 | 390 | 0.6328 | 100.0 | 71.2413 |
| 0.4002 | 6.0 | 468 | 0.3343 | 100.0 | 45.9660 |
| 0.233 | 7.0 | 546 | 0.2361 | 99.2793 | 39.8592 |
| 0.1433 | 8.0 | 624 | 0.1929 | 98.7387 | 34.2659 |
| 0.0852 | 9.0 | 702 | 0.1712 | 98.5586 | 32.4906 |
| 0.0542 | 10.0 | 780 | 0.1727 | 97.4775 | 31.2846 |
| 0.0306 | 11.0 | 858 | 0.1784 | 97.6577 | 30.4294 |
| 0.0187 | 12.0 | 936 | 0.1918 | 97.2973 | 30.3388 |
| 0.011 | 13.0 | 1014 | 0.2040 | 97.1171 | 30.3527 |
| 0.0077 | 14.0 | 1092 | 0.2102 | 96.3964 | 30.2784 |
| 0.0052 | 15.0 | 1170 | 0.2110 | 96.3964 | 29.5534 |
| 0.0036 | 16.0 | 1248 | 0.2198 | 96.0360 | 29.0770 |
| 0.0018 | 17.0 | 1326 | 0.2267 | 96.3964 | 28.9841 |
| 0.0013 | 18.0 | 1404 | 0.2316 | 96.9369 | 28.7842 |
| 0.0009 | 19.0 | 1482 | 0.2333 | 95.6757 | 28.4775 |
| 0.0006 | 20.0 | 1560 | 0.2368 | 96.5766 | 28.8981 |
| 0.0005 | 21.0 | 1638 | 0.2388 | 96.2162 | 28.7494 |
| 0.0004 | 22.0 | 1716 | 0.2410 | 96.0360 | 28.6680 |
| 0.0005 | 23.0 | 1794 | 0.2421 | 96.3964 | 28.7447 |
| 0.0004 | 24.0 | 1872 | 0.2428 | 96.3964 | 28.7819 |
| 0.0004 | 25.0 | 1950 | 0.2433 | 96.2162 | 28.7772 |

Framework versions

  • Transformers 4.48.0
  • Pytorch 2.8.0+cu126
  • Datasets 2.14.7
  • Tokenizers 0.21.4