alankar_classifier_v2_final

This model is a fine-tuned version of ai4bharat/IndicBERTv2-MLM-only; the fine-tuning dataset is not documented in this auto-generated card. It achieves the following results on the evaluation set:

  • Loss: 0.0355
  • F1 Micro: 0.9669

Model description

alankar_classifier_v2_final is a text classification model (~0.3B parameters, F32 Safetensors) fine-tuned from ai4bharat/IndicBERTv2-MLM-only. It identifies alankars, the figures of speech of Hindi poetry such as Anupras (alliteration), Utpreksha, and Roopak, in a given line of verse.

Intended uses & limitations

See the detailed "Intended Use & Limitations" section below.

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 15

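These settings map onto a standard Hugging Face Trainer configuration. The sketch below is illustrative only, not the original training script: it assumes a multi-label classification head, a hypothetical label count of 8, and placeholder train_ds/eval_ds datasets.

from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("ai4bharat/IndicBERTv2-MLM-only")
model = AutoModelForSequenceClassification.from_pretrained(
    "ai4bharat/IndicBERTv2-MLM-only",
    problem_type="multi_label_classification",  # assumption: a line can carry several alankars
    num_labels=8,  # hypothetical; set to the actual number of alankar labels
)

args = TrainingArguments(
    output_dir="alankar_classifier_v2_final",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",         # AdamW with default betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=15,
    eval_strategy="epoch",       # matches the per-epoch validation table below
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,      # hypothetical tokenized datasets (not shown)
    eval_dataset=eval_ds,
    processing_class=tokenizer,
)
trainer.train()
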
Training results

Training Loss   Epoch   Step   Validation Loss   F1 Micro
No log          1.0     48     0.2702            0.3394
No log          2.0     96     0.1215            0.9107
No log          3.0     144    0.0735            0.9298
No log          4.0     192    0.0592            0.9516
No log          5.0     240    0.0482            0.9577
No log          6.0     288    0.0418            0.9665
No log          7.0     336    0.0396            0.9611
No log          8.0     384    0.0379            0.9611
No log          9.0     432    0.0366            0.9640
No log          10.0    480    0.0364            0.9640
0.0927          11.0    528    0.0363            0.9611
0.0927          12.0    576    0.0360            0.9640
0.0927          13.0    624    0.0358            0.9669
0.0927          14.0    672    0.0355            0.9669
0.0927          15.0    720    0.0355            0.9669

("No log" means the Trainer had not yet emitted a training-loss log; with the default logging interval of 500 steps, the first log falls in epoch 11.)

Framework versions

  • Transformers 4.53.0
  • PyTorch 2.6.0+cu124
  • Datasets 2.14.4
  • Tokenizers 0.21.2
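To reproduce this environment, the same versions can be pinned at install time. The command below is standard pip syntax using the versions listed above; note that the +cu124 PyTorch build comes from the PyTorch wheel index, so plain PyPI may install a different CUDA build.

pip install transformers==4.53.0 datasets==2.14.4 tokenizers==0.21.2 torch==2.6.0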

Intended Use & Limitations

This model is intended for educational and literary analysis purposes, providing a programmatic way to identify potential alankars in a line of poetry.

Limitations:

  • Anupras bias: The model has a strong bias towards Anupras (alliteration), whose pattern is the simplest to detect. It may confidently predict Anupras even when a more complex alankar is also present.
  • Weakest classes: The model is weakest at distinguishing Utpreksha from other comparison-based alankars such as Roopak. It relies heavily on marker words (मानो "as if", ज्यों "just as", etc.) for high confidence.
  • Confidence scores: The model often predicts multiple alankars, so look at the relative confidence scores: a high-confidence primary prediction alongside a medium-confidence secondary prediction can indicate that both are present (see the score-interpretation sketch after the usage example below).
  • Modern poetry: While trained on some modern examples, the model's expertise is stronger in classical poetry structures. It may be less accurate on highly colloquial or abstract modern ghazals.

How to Use

Basic Inference with Pipeline

The easiest way to use the model is with a pipeline.

!pip install transformers[torch] -q

from transformers import pipeline

# Load the model from the Hugging Face Hub
model_name = "sastarogers/alankar_classifier_v2_final"
alankar_detector = pipeline(
    "text-classification",
    model=model_name,
    top_k=None  # return scores for all labels (replaces the deprecated return_all_scores=True)
)

# Test with a line of poetry. घटा appears in two senses
# (dark rain cloud / diminished), a classic Yamak pattern.
line = "काली घटा का घमंड घटा।"
predictions = alankar_detector(line)
print(predictions)
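
Because the model often assigns meaningful scores to several alankars at once, it helps to rank the returned scores and keep every label above a cutoff. The snippet below is a suggested interpretation pattern rather than part of the model; the 0.5 threshold is a hypothetical starting point to tune on your own data.

# With top_k=None the pipeline returns one {"label", "score"} dict per class;
# depending on the input shape the result can be nested one level deep.
per_line = predictions[0] if isinstance(predictions[0], list) else predictions

scores = sorted(per_line, key=lambda d: d["score"], reverse=True)

THRESHOLD = 0.5  # hypothetical cutoff; tune on held-out data
detected = [d for d in scores if d["score"] >= THRESHOLD] or scores[:1]

for d in detected:  # falls back to the single top label if nothing passes
    print(f"{d['label']}: {d['score']:.3f}")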