Paper: Unsupervised Cross-lingual Representation Learning at Scale (arXiv:1911.02116)
XLM-RoBERTa is a multilingual model pre-trained on 2.5TB of filtered CommonCrawl data covering 100 languages. It was introduced in the paper Unsupervised Cross-lingual Representation Learning at Scale by Conneau et al. and first released in this repository.
This checkpoint was fine-tuned for intent classification on the Jarbas/ovos_intents_train dataset.
You can use the model for intent classification in the context of the Open Voice OS (OVOS) project, for example as shown below.
from transformers import AutoModelForSequenceClassification, AutoTokenizer, AutoConfig, Trainer
from datasets import load_dataset

model = AutoModelForSequenceClassification.from_pretrained("fdemelo/xlm-roberta-ovos-intent-classifier")
tokenizer = AutoTokenizer.from_pretrained("fdemelo/xlm-roberta-ovos-intent-classifier")
config = AutoConfig.from_pretrained("fdemelo/xlm-roberta-ovos-intent-classifier")

# load the intent dataset (adjust the split name if it differs)
dataset = load_dataset("Jarbas/ovos_intents_train", split="train")

# preprocess dataset: map string labels to ids and tokenize the sentences
def tokenize_function(examples):
    examples["label"] = [config.label2id[x] for x in examples["label"]]
    return tokenizer(examples["sentence"], padding="max_length", truncation=True)

tokenized_dataset = dataset.map(tokenize_function, batched=True)

# the model itself has no .predict method; wrap it in a Trainer for batched inference
trainer = Trainer(model=model)
prediction = trainer.predict(tokenized_dataset)
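
For classifying a single utterance rather than a whole dataset, the same checkpoint can also be wrapped in a text-classification pipeline. This is a minimal sketch; the example utterance below is only an illustrative placeholder and is not taken from the dataset.

from transformers import pipeline

# minimal sketch: wrap the fine-tuned checkpoint in a text-classification pipeline
classifier = pipeline("text-classification", model="fdemelo/xlm-roberta-ovos-intent-classifier")

# classify one utterance; the returned label comes from the model's id2label mapping
print(classifier("what time is it"))  # e.g. [{'label': '<intent name>', 'score': 0.99}]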