---
language: en
license: mit
library_name: transformers
pipeline_tag: text-classification
tags:
  - text-classification
  - motivational-interviewing
  - bert
  - mental-health
  - counseling
  - psychology
  - transformers
  - pytorch
datasets:
  - AnnoMI
metrics:
  - accuracy
  - f1
  - precision
  - recall
model-index:
  - name: bert-motivational-interviewing
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: AnnoMI
          type: AnnoMI
        metrics:
          - type: accuracy
            value: 0.701
            name: Accuracy
          - type: f1
            value: 0.579
            name: F1 Score (macro)
widget:
  - text: I really want to quit smoking.
    example_title: Change Talk
  - text: I don't know if I can do this.
    example_title: Neutral
  - text: I like smoking, it helps me relax.
    example_title: Sustain Talk
---

# BERT for Motivational Interviewing Client Talk Classification

## Model Description

This model is a fine-tuned `bert-base-uncased` model for classifying client utterances in Motivational Interviewing (MI) conversations.

Motivational Interviewing is a counseling approach used to help individuals overcome ambivalence and make positive behavioral changes. This model identifies different types of client talk that indicate their readiness for change.

## Intended Use

- **Primary Use:** Classify client statements in motivational interviewing dialogues
- **Applications:**
  - Counselor training and feedback
  - MI session analysis
  - Automated dialogue systems
  - Mental health research

## Training Data

The model was trained on the AnnoMI dataset (Annotated Motivational Interviewing), which contains expert-annotated counseling dialogues. A data-preparation sketch follows the split sizes below.

- **Training samples:** ~2,400 utterances
- **Validation samples:** ~500 utterances
- **Test samples:** ~700 utterances
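
The exact preprocessing script is not published; the sketch below shows one plausible way to derive such splits, assuming a local CSV export of AnnoMI with hypothetical column names (`interlocutor`, `utterance_text`, `client_talk_type`); adjust them to the release you use.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical column names; adjust to the actual AnnoMI export.
df = pd.read_csv("AnnoMI.csv")
clients = df[df["interlocutor"] == "client"]

label2id = {"change": 0, "neutral": 1, "sustain": 2}
clients = clients[clients["client_talk_type"].isin(list(label2id))]

texts = clients["utterance_text"].tolist()
labels = clients["client_talk_type"].map(label2id).tolist()

# Roughly matches the ~2,400/500/700 sizes above; the exact split used
# for this model is not documented.
train_x, rest_x, train_y, rest_y = train_test_split(
    texts, labels, test_size=0.33, random_state=42, stratify=labels)
val_x, test_x, val_y, test_y = train_test_split(
    rest_x, rest_y, test_size=0.58, random_state=42, stratify=rest_y)
```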

## Labels

The model classifies client talk into three categories:

- `0`: change
- `1`: neutral
- `2`: sustain

### Label Definitions

- **Change Talk:** Client statements expressing desire, ability, reasons, or need for change
  - Example: "I really want to quit smoking" or "I think I can do it"
- **Neutral:** General responses without a clear indication of change or sustain
  - Example: "I don't know" or "Maybe"
- **Sustain Talk:** Client statements expressing reasons for maintaining the current behavior
  - Example: "I like smoking, it helps me relax"

## Performance

### Test Set Metrics

- **Accuracy:** 70.1%
- **Macro F1:** 57.9%
- **Macro Precision:** 59.3%
- **Macro Recall:** 57.3%

### Confusion Matrix

| Actual \ Predicted | change | neutral | sustain |
|--------------------|-------:|--------:|--------:|
| change             |     75 |      78 |      23 |
| neutral            |     43 |     396 |      27 |
| sustain            |     11 |      34 |      36 |

Note: the model performs best on the majority "neutral" class and has room for improvement on the minority "change" and "sustain" classes.
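
Given predictions on the test split, these numbers can be recomputed with scikit-learn; a minimal sketch, where `y_true` and `y_pred` are stand-ins for the real test labels and model predictions:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

y_true = [0, 1, 2, 1, 0]  # placeholder gold labels (0=change, 1=neutral, 2=sustain)
y_pred = [0, 1, 1, 1, 2]  # placeholder model predictions

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="macro")
print(f"Accuracy: {acc:.3f}  Macro P/R/F1: {prec:.3f} / {rec:.3f} / {f1:.3f}")
print(confusion_matrix(y_true, y_pred, labels=[0, 1, 2]))
```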

## Usage

### Quick Start

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load model and tokenizer
model_name = "RyanDDD/bert-motivational-interviewing"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)

# Predict
text = "I really want to quit smoking. It's been affecting my health."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    pred = torch.argmax(probs, dim=1)

label_map = model.config.id2label
print(f"Talk type: {label_map[pred.item()]}")
print(f"Confidence: {probs[0][pred].item():.2%}")
```

### Batch Prediction

```python
texts = [
    "I want to stop drinking.",
    "I don't think I have a problem.",
    "I like drinking with my friends."
]

inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    preds = torch.argmax(probs, dim=1)

for text, pred, prob in zip(texts, preds, probs):
    label = model.config.id2label[pred.item()]
    confidence = prob[pred].item()
    print(f"Text: {text}")
    print(f"Type: {label} ({confidence:.1%})")
    print()
```
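
For larger batches, inference is typically moved to a GPU when one is available; a minimal sketch of the standard PyTorch device handling, reusing `model`, `tokenizer`, and `texts` from above:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

inputs = tokenizer(texts, return_tensors="pt", padding=True,
                   truncation=True, max_length=128)
inputs = {k: v.to(device) for k, v in inputs.items()}

with torch.no_grad():
    logits = model(**inputs).logits
preds = logits.argmax(dim=1).cpu()
```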

## Training Details

### Hyperparameters

- **Base model:** `bert-base-uncased`
- **Max sequence length:** 128 tokens
- **Batch size:** 16
- **Learning rate:** 2e-5
- **Epochs:** 5
- **Optimizer:** AdamW
- **Loss:** Cross-entropy
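
These settings map directly onto the Hugging Face `Trainer`; a hedged reconstruction, not the author's original script (`train_ds` and `val_ds` stand in for tokenized AnnoMI splits):

```python
from transformers import (BertForSequenceClassification, Trainer,
                          TrainingArguments)

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3,
    id2label={0: "change", 1: "neutral", 2: "sustain"},
    label2id={"change": 0, "neutral": 1, "sustain": 2},
)

args = TrainingArguments(
    output_dir="bert-mi",            # hypothetical output directory
    per_device_train_batch_size=16,  # batch size 16
    learning_rate=2e-5,              # learning rate 2e-5
    num_train_epochs=5,              # 5 epochs
)

# AdamW and cross-entropy are the Trainer defaults for this model class,
# matching the optimizer and loss listed above.
trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=val_ds)
trainer.train()
```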

### Hardware

Trained on a single GPU; an NVIDIA GPU is recommended for reproducing the fine-tuning.

## Limitations

1. **Class imbalance:** The model performs better on "neutral" (the majority class) than on "change" and "sustain" (a common mitigation is sketched below)
2. **Context:** The model classifies single utterances without conversational context
3. **Domain:** Trained specifically on MI conversations; may not generalize to other counseling types
4. **Language:** English only
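
The class imbalance in point 1 is commonly mitigated with a weighted cross-entropy loss. A sketch of that standard technique (not something the released model used; the counts reuse the test-split totals from the confusion matrix purely for illustration):

```python
import torch
import torch.nn.functional as F

# Inverse-frequency class weights (change, neutral, sustain).
counts = torch.tensor([176.0, 466.0, 81.0])
weights = counts.sum() / (len(counts) * counts)

logits = torch.randn(8, 3)          # placeholder batch of model outputs
labels = torch.randint(0, 3, (8,))  # placeholder gold labels
loss = F.cross_entropy(logits, labels, weight=weights)
```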

## Ethical Considerations

- This model is intended to assist, not replace, human counselors
- Predictions should be reviewed by qualified professionals
- Privacy and confidentiality must be maintained when processing real counseling data
- Be aware of potential biases in the training data

## Citation

If you use this model, please cite:

```bibtex
@misc{bert-mi-classifier-2024,
  author       = {Ryan},
  title        = {BERT for Motivational Interviewing Client Talk Classification},
  year         = {2024},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/RyanDDD/bert-motivational-interviewing}}
}
```

## Model Card Contact

For questions or feedback, please open an issue in the model repository.