Paper: Longformer: The Long-Document Transformer (arXiv:2004.05150)
xlm-roberta-longformer is a multilingual Longformer initialized with XLM-RoBERTa's weights without further pretraining. It is intended to be fine-tuned on a downstream task.
| Model | attention_window | hidden_size | num_hidden_layers | model_max_length |
|---|---|---|---|---|
| base | 256 | 768 | 12 | 16384 |
| large | 512 | 1024 | 24 | 16384 |
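
To give a rough sense of what the `attention_window` values in the table mean at the maximum sequence length, the sketch below compares the approximate number of attention scores computed by sliding-window attention against full self-attention. It is a back-of-the-envelope estimate, not a measurement: it assumes each token attends to roughly `attention_window` neighbours, and it ignores global-attention tokens and sequence-edge effects. The helper name `attention_scores` is hypothetical and introduced here only for illustration.

```python
from typing import Tuple

# Settings copied from the table above.
MODELS = {
    "base": {"attention_window": 256, "model_max_length": 16384},
    "large": {"attention_window": 512, "model_max_length": 16384},
}


def attention_scores(seq_len: int, window: int) -> Tuple[int, int]:
    """Return approximate (sliding_window, full) score counts per head per layer.

    Assumption: every token attends to about `window` positions; global
    attention and boundary effects are ignored.
    """
    local = seq_len * min(window, seq_len)
    full = seq_len * seq_len
    return local, full


for name, cfg in MODELS.items():
    local, full = attention_scores(cfg["model_max_length"], cfg["attention_window"])
    print(f"{name}: {local:,} vs {full:,} scores ({full // local}x fewer)")
# base: 4,194,304 vs 268,435,456 scores (64x fewer)
# large: 8,388,608 vs 268,435,456 scores (32x fewer)
```

Under these assumptions the cost grows linearly with sequence length for a fixed window, which is what makes a 16384-token `model_max_length` practical.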