Article Multi-Label Classification Model From Scratch: Step-by-Step Tutorial Jan 8, 2024 • 49
GLiNER-PII Collection PII detection models developed in collaboration with Wordcab • 5 items • Updated Sep 24 • 21
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations Paper • 2509.03405 • Published Sep 3 • 23
mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1800 languages, showing SoTA scores on benchmarks and exceptional low-resource performance • 16 items • Updated Sep 9 • 49
GLiClass: Generalist Lightweight Model for Sequence Classification Tasks Paper • 2508.07662 • Published Aug 11 • 9
GLiClass ONNX Collection GLiClass models converted to ONNX format, along with 8-bit quantized versions • 5 items • Updated Jul 29 • 4
GLiCLass-V3 Collection Models for zero-shot text classification that are up to 50 times faster than Cross-Encoders and show the same or higher accuracy. • 8 items • Updated Aug 13 • 18
GLiNER-X Collection A multilingual Named Entity Recognition (NER) model capable of identifying any entity type. • 6 items • Updated Jun 24 • 21
Article Training and Finetuning Reranker Models with Sentence Transformers v4 Mar 26 • 176
GLiNER-biomed: A Suite of Efficient Models for Open Biomedical Named Entity Recognition Paper • 2504.00676 • Published Apr 1 • 5
GLiNER-BioMed Collection Collection of high-quality GLiNER models tuned for working with biomedical data • 7 items • Updated Apr 2 • 7
Article Introducing EuroBERT: A High-Performance Multilingual Encoder Model Mar 10 • 146
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15 • 221