# alankar_classifier_v2_final
This model is a fine-tuned version of ai4bharat/IndicBERTv2-MLM-only for classifying alankars (figures of speech) in lines of poetry. It achieves the following results on the evaluation set:
- Loss: 0.0355
- F1 Micro: 0.9669
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 15
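
For reference, the hyperparameters above map onto Hugging Face `TrainingArguments` roughly as in the sketch below. The `output_dir` and per-epoch evaluation strategy are assumptions inferred from this card, not values stated by the original training script.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters expressed as TrainingArguments.
# output_dir and eval_strategy are assumed, not stated explicitly in this card.
training_args = TrainingArguments(
    output_dir="alankar_classifier_v2_final",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",        # AdamW; betas=(0.9, 0.999) and epsilon=1e-08 are the defaults
    lr_scheduler_type="linear",
    num_train_epochs=15,
    eval_strategy="epoch",      # the results table reports one evaluation per epoch
)
```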
### Training results
| Training Loss | Epoch | Step | Validation Loss | F1 Micro |
|---|---|---|---|---|
| No log | 1.0 | 48 | 0.2702 | 0.3394 |
| No log | 2.0 | 96 | 0.1215 | 0.9107 |
| No log | 3.0 | 144 | 0.0735 | 0.9298 |
| No log | 4.0 | 192 | 0.0592 | 0.9516 |
| No log | 5.0 | 240 | 0.0482 | 0.9577 |
| No log | 6.0 | 288 | 0.0418 | 0.9665 |
| No log | 7.0 | 336 | 0.0396 | 0.9611 |
| No log | 8.0 | 384 | 0.0379 | 0.9611 |
| No log | 9.0 | 432 | 0.0366 | 0.9640 |
| No log | 10.0 | 480 | 0.0364 | 0.9640 |
| 0.0927 | 11.0 | 528 | 0.0363 | 0.9611 |
| 0.0927 | 12.0 | 576 | 0.0360 | 0.9640 |
| 0.0927 | 13.0 | 624 | 0.0358 | 0.9669 |
| 0.0927 | 14.0 | 672 | 0.0355 | 0.9669 |
| 0.0927 | 15.0 | 720 | 0.0355 | 0.9669 |
### Framework versions
- Transformers 4.53.0
- Pytorch 2.6.0+cu124
- Datasets 2.14.4
- Tokenizers 0.21.2
## Intended Use & Limitations
This model is intended for educational and literary analysis purposes, providing a programmatic way to identify potential alankars in a line of poetry.
Limitations:
- Anupras Bias: The model has a strong bias towards Anupras (alliteration), as its pattern is the simplest to detect. It may confidently predict Anupras even when a more complex alankar is also present.
- Weakest Classes: The model is weakest at distinguishing Utpreksha from other comparison-based alankars like Roopak. It relies heavily on keywords (मानो, ज्यों, etc.) for high confidence.
- Confidence Scores: The model often predicts multiple alankars. It is best to look at the relative confidence scores: a high-confidence primary prediction and a medium-confidence secondary prediction can indicate the presence of both.
- Modern Poetry: While trained on some modern examples, its expertise is stronger in classical poetry structures. It may be less accurate on highly colloquial or abstract modern ghazals.
## How to Use
### Basic Inference with Pipeline
The easiest way to use the model is with a pipeline.
```bash
pip install transformers[torch] -q
```

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
model_name = "sastarogers/alankar_classifier_v2_final"
alankar_detector = pipeline(
    "text-classification",
    model=model_name,
    top_k=None,  # return scores for every label (replaces the deprecated return_all_scores=True)
)

# Test with a line of poetry
line = "काली घटा का घमंड घटा।"
predictions = alankar_detector(line)
print(predictions)
```
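
Because the model often assigns non-trivial probability to several alankars at once (see Limitations above), it helps to sort the returned scores and keep the strongest ones. The snippet below is a small illustration: the 0.5 threshold is an arbitrary example rather than a calibrated value, and the first line simply guards against the pipeline's nested output shape for single-string inputs.

```python
# Normalise the output: for a single string the pipeline may nest the
# per-label scores one level deep, so flatten to a list of dicts first.
all_scores = predictions[0] if isinstance(predictions[0], list) else predictions

# Sort labels by confidence and keep those above an illustrative threshold.
all_scores = sorted(all_scores, key=lambda p: p["score"], reverse=True)
THRESHOLD = 0.5  # example cut-off, not a tuned value

for p in all_scores[:3]:
    print(f"{p['label']}: {p['score']:.3f}")

likely_alankars = [p["label"] for p in all_scores if p["score"] >= THRESHOLD]
print("Predicted alankars:", likely_alankars)
```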