upload custom README
README.md (changed)
---
library_name: transformers
base_model:
- mercelisw/electra-grc
---

# Model Card for Model ID
[…]

This model is part of a series of models trained for the ML4AL paper “Gotta ca…”

### Model Description

- **Developed by:** Marijke Beersmans & Alek Keersmaekers
- **Model type:** ElectraForTokenClassification, finetuned for NER (PERS, LOC, GRP); a usage sketch follows below.
- **Language(s) (NLP):** Ancient Greek (greek_glaux normalization)
- **Finetuned from model:** mercelisw/electra-grc

### Model Sources

- **Repository:** [NERAncientGreekML4AL GitHub](https://github.com/NER-AncientLanguages/NERAncientGreekML4AL.git) (for data and training scripts)
- **Paper:** [ML4AL paper](https://aclanthology.org/2024.ml4al-1.16/)
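
Since this card keeps the generic “Model Card for Model ID” placeholder, the repository ID in the sketch below is hypothetical; this is a minimal usage example assuming the standard transformers token-classification API:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Hypothetical repository ID: the card does not state the final model ID.
MODEL_ID = "your-namespace/electra-grc-ner"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForTokenClassification.from_pretrained(MODEL_ID)

# Merge sub-word predictions into whole entity spans (PERS, LOC, GRP).
ner = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
)

# Input text should follow the greek_glaux normalization noted above.
print(ner("Σωκράτης ἐν Ἀθήναις ἐδίδασκεν."))
```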

## Training Details

[…]

We thank the following projects for providing the training data: […]

We use Weights & Biases for hyperparameter optimization with a random search strategy (10 folds), aiming to maximize the evaluation F1 score (eval_f1).

The search space includes (see the sweep sketch after this list):
- Learning Rate: Sampled uniformly between 1e-6 and 1e-4
- Weight Decay: One of [0.1, 0.01, 0.001]
- Number of Training Epochs: One of [3, 4, 5, 6]
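
A sketch of this search space as a Weights & Biases sweep configuration; the project name and the train() stub are assumptions, not taken from this card:

```python
import wandb

# Random-search sweep over the space listed above, maximizing eval_f1.
sweep_config = {
    "method": "random",
    "metric": {"name": "eval_f1", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"distribution": "uniform", "min": 1e-6, "max": 1e-4},
        "weight_decay": {"values": [0.1, 0.01, 0.001]},
        "num_train_epochs": {"values": [3, 4, 5, 6]},
    },
}

def train():
    # Hypothetical stub: the real fine-tuning run must log eval_f1,
    # e.g. wandb.log({"eval_f1": f1}), so the sweep can optimize it.
    wandb.init()

sweep_id = wandb.sweep(sweep_config, project="ner-ancient-greek")  # project name assumed
wandb.agent(sweep_id, function=train)
```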

For the final training of this model, the hyperparameters were (see the sketch after this list):

- Learning Rate: 9.889410158465026e-05
- Weight Decay: 0.1
- Number of Training Epochs: 5
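
For reference, a sketch of these values expressed as transformers.TrainingArguments; output_dir is an illustrative placeholder and all other arguments keep their library defaults:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="electra-grc-ner",         # placeholder path, not from this card
    learning_rate=9.889410158465026e-05,  # from the sweep above
    weight_decay=0.1,
    num_train_epochs=5,
)
```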

## Evaluation

This model was evaluated on precision, recall, and macro F1 for its entity classes. See the paper for more information.
| Label        | precision | recall | f1-score | support |
|:-------------|----------:|-------:|---------:|--------:|
| GRP          |    0.8054 | 0.8013 |   0.8033 |    1384 |
| LOC          |    0.7379 | 0.6905 |   0.7134 |    1105 |
| PERS         |    0.8530 | 0.8660 |   0.8595 |    3090 |
| micro avg    |    0.8198 | 0.8152 |   0.8175 |    5579 |
| macro avg    |    0.7988 | 0.7859 |   0.7921 |    5579 |
| weighted avg |    0.8184 | 0.8152 |   0.8166 |    5579 |
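
The card does not say which tool produced this report, but seqeval's classification_report emits exactly this layout from BIO-tagged sequences; a toy sketch:

```python
from seqeval.metrics import classification_report

# Toy BIO-tagged gold and predicted sequences, for illustration only;
# the real evaluation uses the held-out data described in the paper.
y_true = [["B-PERS", "I-PERS", "O", "B-LOC", "O", "B-GRP"]]
y_pred = [["B-PERS", "I-PERS", "O", "B-LOC", "O", "O"]]

# Prints per-label precision/recall/F1/support plus micro, macro and
# weighted averages, matching the table above.
print(classification_report(y_true, y_pred, digits=4))
```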

If you use this work, please cite the following paper:

[…]

Beersmans, M., Keersmaekers, A., de Graaf, E., Van de Cruys, T., Depauw, M., & F…

[…]
  year = {2024},
  month = aug,
  pages = {152--164}
}