Model Card for Longformer-es-mental-large

Model Description

Longformer-es-mental-large is a Spanish domain-adapted language model designed for the analysis of mental health–related content in long user-generated texts.

It is based on the Longformer architecture, which extends the standard Transformer attention mechanism to efficiently process long sequences. The model supports input sequences of up to 4096 tokens, enabling it to capture long-range dependencies and temporal patterns that are particularly relevant in mental health monitoring and early risk detection settings.
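As a quick sanity check of the long-context setup, the configuration published with the checkpoint can be inspected. This is a minimal sketch; the field names are the standard 🤗 transformers Longformer configuration attributes:

  from transformers import AutoConfig
  
  config = AutoConfig.from_pretrained("ELiRF/Longformer-es-mental-large")
  
  # Maximum supported input length (around 4096 tokens plus special positions)
  print(config.max_position_embeddings)
  
  # Size of the local sliding attention window, per layer
  print(config.attention_window)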

This model was obtained through domain-adaptive pre-training (DAP) on a large corpus of mental health–related texts from Reddit communities focused on psychological support and mental health discussions, automatically translated into Spanish. The adaptation process allows the model to better capture emotional expressions, self-disclosure patterns, and discourse structures characteristic of mental health narratives in Spanish.

Longformer-es-mental-large is released as a foundational model and does not include task-specific fine-tuning.

This is the model card of a 🤗 transformers model that has been pushed to the Hub.

  • Developed by: ELiRF group, VRAIN (Valencian Research Institute for Artificial Intelligence), Universitat Politècnica de València
  • Model type: Transformer-based masked language model (Longformer)
  • Language(s) (NLP): Spanish
  • License: Same as base model (PlanTL-GOB-ES models)
  • Finetuned from model: ELiRF/RoBERTa-es-mental-large

Uses

This model is intended for research purposes in the mental health NLP domain.

Direct Use

The model can be used directly as a feature extractor or encoder for Spanish mental health–related texts, particularly when long input sequences are required.
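For example, a minimal feature-extraction sketch using mean pooling over the last hidden states (the pooling strategy is illustrative and not prescribed by the model) could look as follows:

  import torch
  from transformers import AutoTokenizer, AutoModel
  
  tokenizer = AutoTokenizer.from_pretrained("ELiRF/Longformer-es-mental-large")
  model = AutoModel.from_pretrained("ELiRF/Longformer-es-mental-large")
  
  texts = [
      "Últimamente me cuesta dormir y me siento sin energía.",
      "Hoy tuve un buen día y salí a caminar con amigos.",
  ]
  inputs = tokenizer(texts, return_tensors="pt", padding=True,
                     truncation=True, max_length=4096)
  
  with torch.no_grad():
      outputs = model(**inputs)
  
  # Mean-pool the token embeddings, ignoring padding positions
  mask = inputs["attention_mask"].unsqueeze(-1)
  embeddings = (outputs.last_hidden_state * mask).sum(1) / mask.sum(1)
  print(embeddings.shape)  # (batch_size, hidden_size)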

Downstream Use

The model is primarily intended to be fine-tuned for downstream tasks such as:

  • Mental disorder detection
  • Early risk detection
  • User-level and context-level classification
  • Analysis of long social media timelines related to psychological well-being

It has been evaluated on early risk detection benchmarks when fine-tuned with task-specific datasets and methodologies.
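As an illustration, a minimal fine-tuning sketch for a binary detection task is given below. The tiny in-memory dataset, label scheme, and hyperparameters are placeholders to keep the example self-contained; they do not reproduce the setup from the paper:

  from datasets import Dataset
  from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                            Trainer, TrainingArguments)
  
  model_name = "ELiRF/Longformer-es-mental-large"
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
  
  # Placeholder corpus; in practice, concatenated user histories with task labels
  train_ds = Dataset.from_dict({
      "text": ["Llevo semanas sin poder dormir y todo me abruma.",
               "Hoy me siento tranquilo y con ganas de salir."],
      "label": [1, 0],
  })
  
  def tokenize(batch):
      # Long inputs are supported natively thanks to the 4096-token context
      return tokenizer(batch["text"], truncation=True, max_length=4096)
  
  train_ds = train_ds.map(tokenize, batched=True)
  
  args = TrainingArguments(
      output_dir="longformer-es-mental-finetuned",
      per_device_train_batch_size=2,
      gradient_accumulation_steps=8,
      learning_rate=2e-5,
      num_train_epochs=3,
      fp16=True,  # mixed precision; requires a GPU
  )
  
  trainer = Trainer(model=model, args=args, train_dataset=train_ds, tokenizer=tokenizer)
  trainer.train()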

Out-of-Scope Use

  • Use on languages other than Spanish
  • High-stakes decision-making affecting individuals’ health or safety
  • Real-time intervention systems without human supervision

Bias, Risks, and Limitations

  • The training data originates from social media platforms, which may introduce demographic, cultural, and linguistic biases.
  • The data was automatically translated into Spanish, which may introduce translation artifacts or subtle semantic drift.
  • Mental health language is highly contextual and subjective; predictions may be unstable when evidence is limited.
  • The model does not provide explanations or clinical interpretations of its outputs.

How to Get Started with the Model

Use the code below to get started with the model.

  from transformers import AutoTokenizer, AutoModel
  
  tokenizer = AutoTokenizer.from_pretrained("ELiRF/Longformer-es-mental-large")
  model = AutoModel.from_pretrained("ELiRF/Longformer-es-mental-large")
  
  inputs = tokenizer(
      "Ejemplo de texto relacionado con salud mental.",
      return_tensors="pt",
      truncation=True,
      max_length=4096
  )
  
  outputs = model(**inputs)
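The base encoder does not assign global attention to any token unless instructed to; for pooling or classification over long inputs, it is common to give at least the first token global attention. A short follow-up sketch, assuming the checkpoint loads as a standard Longformer encoder and continuing the variables above:

  import torch
  
  # Mark the first (<s>) token as globally attending so it can see the whole sequence
  global_attention_mask = torch.zeros_like(inputs["input_ids"])
  global_attention_mask[:, 0] = 1
  
  outputs = model(**inputs, global_attention_mask=global_attention_mask)
  print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)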

Training Details

Training Data

The model was domain-adapted using a merged corpus composed of:

  • Reddit SuicideWatch and Mental Health Collection (SWMH)
  • Reddit Mental Health Narratives (RMHN)

All texts were automatically translated into Spanish using neural machine translation. The final dataset contains approximately 1.9 million posts from multiple mental health–related communities.

Training Procedure

The model was trained using domain-adaptive pre-training (DAP) with a masked language modeling objective. Pre-training was performed for 20 epochs using multiple GPUs, following the same procedure applied to the other foundational models described in the paper “Improving Mental Health Screening and Early Risk Detection in Spanish”.

No task-specific fine-tuning is included in this checkpoint.
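A simplified sketch of this kind of continued masked-language-model pre-training with 🤗 transformers is shown below. The tiny corpus, starting checkpoint, and hyperparameters are placeholders to keep the example self-contained; they are not the exact configuration used to produce this checkpoint:

  from datasets import Dataset
  from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                            DataCollatorForLanguageModeling, Trainer, TrainingArguments)
  
  model_name = "ELiRF/Longformer-es-mental-large"  # placeholder starting point
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = AutoModelForMaskedLM.from_pretrained(model_name)
  
  # Placeholder in-domain corpus; the actual DAP corpus is the translated Reddit data
  corpus = Dataset.from_dict({"text": [
      "Últimamente me siento muy ansioso y no sé con quién hablar.",
      "Ir a terapia me ha ayudado a entender lo que me pasa.",
  ]})
  
  def tokenize(batch):
      return tokenizer(batch["text"], truncation=True, max_length=4096)
  
  tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])
  
  # Randomly mask tokens for the masked language modeling objective
  collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
  
  args = TrainingArguments(
      output_dir="dap-mlm",
      num_train_epochs=20,            # 20 epochs, as reported for this checkpoint
      per_device_train_batch_size=1,
      fp16=True,                      # mixed precision, as in the reported regime
  )
  
  Trainer(model=model, args=args, train_dataset=tokenized,
          data_collator=collator).train()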

Preprocessing

  • Automatic translation to Spanish
  • Text normalization and tokenization using the base model tokenizer

Training Hyperparameters

  • Training regime: fp16 mixed precision

Speeds, Sizes, Times

  • Model size: ~435M parameters
  • Training duration: 4 days
  • Checkpoint size: 1.7GB

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated after fine-tuning on Spanish mental health benchmarks (e.g., MentalRisk shared tasks).

Factors

  • Disorder type
  • Amount of available user context
  • Early vs full-context settings

Metrics

  • Macro F1-score
  • ERDE (Early Risk Detection Error; defined below)
  • Latency True Positive (LTP)
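For reference, ERDE as commonly defined in early risk detection evaluations penalizes correct positive decisions by the number of user writings k observed before the decision; in LaTeX notation, with deadline parameter o (e.g. ERDE_5, ERDE_50) and error costs c_fp, c_fn, c_tp:

  \mathrm{ERDE}_o(d, k) =
  \begin{cases}
    c_{fp} & \text{false positive} \\
    c_{fn} & \text{false negative} \\
    lc_o(k) \cdot c_{tp} & \text{true positive} \\
    0 & \text{true negative}
  \end{cases}
  \qquad
  lc_o(k) = 1 - \frac{1}{1 + e^{\,k - o}}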

Results

When fine-tuned on Spanish mental health benchmarks, Longformer-es-mental-large shows competitive performance and improves upon the state of the art in both full-context (user-level) and early detection mental health tasks.

Summary

Longformer-es-mental-large is a Spanish domain-adapted long-context language model for mental health text analysis. It is based on the Longformer architecture and supports input sequences of up to 4096 tokens, enabling the modeling of long user message histories. The model shows strong performance on Spanish mental health detection and early risk detection tasks when fine-tuned on domain-specific datasets.

Technical Specifications

Model Architecture and Objective

  • Longformer architecture
  • Masked Language Modeling objective

Compute Infrastructure

Hardware

  • Multiple NVIDIA A40 GPUs

Citation

This model is part of ongoing research currently under review. The final version of the paper will be linked once it is published.

Model Card Authors

ELiRF research group (VRAIN, Universitat Politècnica de València)
