---
language: en
license: mit
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- motivational-interviewing
- bert
- mental-health
- counseling
- psychology
- transformers
- pytorch
datasets:
- AnnoMI
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: bert-motivational-interviewing
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: AnnoMI
      type: AnnoMI
    metrics:
    - type: accuracy
      value: 0.701
      name: Accuracy
    - type: f1
      value: 0.579
      name: F1 Score (macro)
widget:
- text: "I really want to quit smoking."
  example_title: "Change Talk"
- text: "I don't know if I can do this."
  example_title: "Neutral"
- text: "I like smoking, it helps me relax."
  example_title: "Sustain Talk"
---

# BERT for Motivational Interviewing Client Talk Classification

## Model Description

This is a **BERT-base-uncased** model fine-tuned to classify client utterances in **Motivational Interviewing (MI)** conversations.

Motivational Interviewing is a counseling approach used to help individuals overcome ambivalence and make positive behavioral changes. This model identifies different types of client talk that indicate their readiness for change.

## Intended Use

- **Primary Use**: Classify client statements in motivational interviewing dialogues
- **Applications**: 
  - Counselor training and feedback
  - MI session analysis
  - Automated dialogue systems
  - Mental health research

## Training Data

The model was trained on the **AnnoMI dataset** (Annotated Motivational Interviewing), which contains expert-annotated counseling dialogues.

- **Training samples**: ~2,400 utterances
- **Validation samples**: ~500 utterances  
- **Test samples**: ~700 utterances
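
The preprocessing script is not included in this card. The sketch below illustrates the general approach of keeping only client utterances and mapping the three talk types to label ids; the file name and column names (`interlocutor`, `client_talk_type`, `utterance_text`) are assumptions and should be checked against the AnnoMI repository.

```python
import pandas as pd

# Assumed file and column names; verify against the AnnoMI repository.
df = pd.read_csv("AnnoMI-full.csv")

label2id = {"change": 0, "neutral": 1, "sustain": 2}

# Keep client turns whose talk type is one of the three target labels.
clients = df[df["interlocutor"] == "client"].copy()
clients = clients[clients["client_talk_type"].isin(label2id.keys())]
clients["label"] = clients["client_talk_type"].map(label2id)

texts = clients["utterance_text"].tolist()
labels = clients["label"].tolist()
```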

## Labels

The model classifies client talk into three categories:

- **0**: change
- **1**: neutral
- **2**: sustain

### Label Definitions

- **Change Talk**: Client statements expressing desire, ability, reasons, or need for change
  - Example: "I really want to quit smoking" or "I think I can do it"

- **Neutral**: General responses without clear indication of change or sustain
  - Example: "I don't know" or "Maybe"

- **Sustain Talk**: Client statements expressing reasons for maintaining current behavior
  - Example: "I like smoking, it helps me relax"
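
The same mapping is stored in the model configuration, so predictions can be decoded without hard-coding indices:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("RyanDDD/bert-motivational-interviewing")
print(config.id2label)  # expected: {0: 'change', 1: 'neutral', 2: 'sustain'}
```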

## Performance

### Test Set Metrics

- **Accuracy**: 70.1%
- **Macro F1**: 57.9%
- **Macro Precision**: 59.3%
- **Macro Recall**: 57.3%

### Confusion Matrix

```
              Predicted
              change  neutral  sustain
Actual change    75      78       23
       neutral   43     396       27  
       sustain   11      34       36
```

**Note**: The model performs best on the "neutral" class (the most frequent) and has the most room for improvement on the "change" and "sustain" classes.
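
For reference, the macro metrics above follow directly from this confusion matrix; the snippet below is a small sanity check using NumPy.

```python
import numpy as np

# Confusion matrix from the table above (rows = actual, columns = predicted).
cm = np.array([
    [75, 78, 23],   # actual: change
    [43, 396, 27],  # actual: neutral
    [11, 34, 36],   # actual: sustain
])

tp = np.diag(cm).astype(float)
precision = tp / cm.sum(axis=0)   # per predicted class
recall = tp / cm.sum(axis=1)      # per actual class
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy:        {tp.sum() / cm.sum():.3f}")   # ~0.701
print(f"Macro precision: {precision.mean():.3f}")      # ~0.593
print(f"Macro recall:    {recall.mean():.3f}")         # ~0.573
print(f"Macro F1:        {f1.mean():.3f}")             # ~0.579
```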

## Usage

### Quick Start

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load model and tokenizer
model_name = "RyanDDD/bert-motivational-interviewing"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)

# Predict
text = "I really want to quit smoking. It's been affecting my health."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    pred = torch.argmax(probs, dim=1)

label_map = model.config.id2label
print(f"Talk type: {label_map[pred.item()]}")
print(f"Confidence: {probs[0][pred].item():.2%}")
```

### Batch Prediction

```python
# Reuses the `tokenizer` and `model` loaded in the Quick Start example above
texts = [
    "I want to stop drinking.",
    "I don't think I have a problem.",
    "I like drinking with my friends."
]

inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    preds = torch.argmax(probs, dim=1)

for text, pred, prob in zip(texts, preds, probs):
    label = model.config.id2label[pred.item()]
    confidence = prob[pred].item()
    print(f"Text: {text}")
    print(f"Type: {label} ({confidence:.1%})")
    print()
```
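
### Pipeline API

Alternatively, the Transformers `pipeline` helper can be used; it handles tokenization and label decoding internally (the score below is shown for illustration only):

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="RyanDDD/bert-motivational-interviewing",
)

print(classifier("I really want to quit smoking."))
# e.g. [{'label': 'change', 'score': 0.87}]
```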

## Training Details

### Hyperparameters

- **Base model**: `bert-base-uncased`
- **Max sequence length**: 128 tokens
- **Batch size**: 16
- **Learning rate**: 2e-5
- **Epochs**: 5
- **Optimizer**: AdamW
- **Loss**: Cross-entropy
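
The original training script is not part of this repository; a minimal `Trainer`-based sketch mirroring the hyperparameters above might look like this (the tokenized datasets `train_ds` and `val_ds` are placeholders):

```python
from transformers import (
    BertForSequenceClassification,
    BertTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3,
    id2label={0: "change", 1: "neutral", 2: "sustain"},
    label2id={"change": 0, "neutral": 1, "sustain": 2},
)

args = TrainingArguments(
    output_dir="bert-motivational-interviewing",
    num_train_epochs=5,
    per_device_train_batch_size=16,
    learning_rate=2e-5,  # AdamW optimizer and cross-entropy loss are the defaults
)

# train_ds / val_ds: utterances tokenized with max_length=128 (placeholders here).
# trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=val_ds)
# trainer.train()
```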

### Hardware

Trained on a single GPU (NVIDIA GPU recommended).

## Limitations

1. **Class Imbalance**: The model performs better on "neutral" (majority class) than "change" and "sustain"
2. **Context**: The model classifies single utterances without conversation context
3. **Domain**: Trained specifically on MI conversations; may not generalize to other counseling types
4. **Language**: English only

## Ethical Considerations

- This model is intended to **assist**, not replace, human counselors
- Predictions should be reviewed by qualified professionals
- Privacy and confidentiality must be maintained when processing real counseling data
- Be aware of potential biases in training data

## Citation

If you use this model, please cite:

```bibtex
@misc{bert-mi-classifier-2024,
  author = {Ryan},
  title = {BERT for Motivational Interviewing Client Talk Classification},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/RyanDDD/bert-motivational-interviewing}}
}
```

## References

- **AnnoMI Dataset**: [GitHub](https://github.com/uccollab/AnnoMI)
- **BERT Paper**: [Devlin et al., 2019](https://arxiv.org/abs/1810.04805)
- **Motivational Interviewing**: [Miller & Rollnick, 2012](https://motivationalinterviewing.org/)

## Model Card Contact

For questions or feedback, please open an issue in the model repository.