---
language: en
license: mit
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- motivational-interviewing
- bert
- mental-health
- counseling
- psychology
- transformers
- pytorch
datasets:
- AnnoMI
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: bert-motivational-interviewing
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: AnnoMI
      type: AnnoMI
    metrics:
    - type: accuracy
      value: 0.701
      name: Accuracy
    - type: f1
      value: 0.579
      name: F1 Score (macro)
widget:
- text: "I really want to quit smoking."
  example_title: "Change Talk"
- text: "I don't know if I can do this."
  example_title: "Neutral"
- text: "I like smoking, it helps me relax."
  example_title: "Sustain Talk"
---
# BERT for Motivational Interviewing Client Talk Classification
## Model Description
This model is a fine-tuned **BERT-base-uncased** model for classifying client utterances in **Motivational Interviewing (MI)** conversations.
Motivational Interviewing is a counseling approach used to help individuals overcome ambivalence and make positive behavioral changes. This model identifies the types of client talk that indicate a client's readiness for change.
## Intended Use
- **Primary Use**: Classify client statements in motivational interviewing dialogues
- **Applications**:
  - Counselor training and feedback
  - MI session analysis
  - Automated dialogue systems
  - Mental health research
## Training Data
The model was trained on the **AnnoMI dataset** (Annotated Motivational Interviewing), which contains expert-annotated counseling dialogues.
- **Training samples**: ~2,400 utterances
- **Validation samples**: ~500 utterances
- **Test samples**: ~700 utterances
## Labels
The model classifies client talk into three categories:
- **0**: change
- **1**: neutral
- **2**: sustain
### Label Definitions
- **Change Talk**: Client statements expressing desire, ability, reasons, or need for change
  - Example: "I really want to quit smoking" or "I think I can do it"
- **Neutral**: General responses without clear indication of change or sustain
  - Example: "I don't know" or "Maybe"
- **Sustain Talk**: Client statements expressing reasons for maintaining current behavior
  - Example: "I like smoking, it helps me relax"
## Performance
### Test Set Metrics
- **Accuracy**: 70.1%
- **Macro F1**: 57.9%
- **Macro Precision**: 59.3%
- **Macro Recall**: 57.3%
### Confusion Matrix
```
                     Predicted
                 change  neutral  sustain
Actual  change       75       78       23
        neutral      43      396       27
        sustain      11       34       36
```
**Note**: The model performs best on the "neutral" class, which is the most frequent (recall 85.0%). Recall is much lower on "change" (42.6%) and "sustain" (44.4%), and both classes are most often misclassified as "neutral".
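
The reported test metrics can be reproduced from the confusion matrix above. The following is a minimal sketch in plain Python, assuming rows are actual labels and columns are predicted labels, as the matrix header indicates. The helper name `per_class_metrics` is illustrative and not part of the model repository.

```python
# Test-set confusion matrix from this card: rows = actual, columns = predicted.
labels = ["change", "neutral", "sustain"]
cm = [
    [75, 78, 23],   # actual: change
    [43, 396, 27],  # actual: neutral
    [11, 34, 36],   # actual: sustain
]


def per_class_metrics(cm):
    """Return (precision, recall, f1) lists with one entry per class."""
    n = len(cm)
    precision, recall, f1 = [], [], []
    for i in range(n):
        tp = cm[i][i]
        predicted_i = sum(cm[r][i] for r in range(n))  # column sum
        actual_i = sum(cm[i])                          # row sum
        p = tp / predicted_i if predicted_i else 0.0
        r = tp / actual_i if actual_i else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(f)
    return precision, recall, f1


total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(cm))) / total

precision, recall, f1 = per_class_metrics(cm)
# Macro averages weight every class equally, regardless of its frequency.
macro_precision = sum(precision) / len(labels)
macro_recall = sum(recall) / len(labels)
macro_f1 = sum(f1) / len(labels)

print(f"Accuracy:        {accuracy:.1%}")
print(f"Macro precision: {macro_precision:.1%}")
print(f"Macro recall:    {macro_recall:.1%}")
print(f"Macro F1:        {macro_f1:.1%}")
```

Macro F1 here is the unweighted mean of the per-class F1 scores, which matches the value reported in this card.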
## Usage
### Quick Start
```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load model and tokenizer
model_name = "RyanDDD/bert-motivational-interviewing"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)

# Predict
text = "I really want to quit smoking. It's been affecting my health."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)

probs = torch.softmax(outputs.logits, dim=1)
pred = torch.argmax(probs, dim=1)

label_map = model.config.id2label
print(f"Talk type: {label_map[pred.item()]}")
print(f"Confidence: {probs[0][pred].item():.2%}")
```
### Batch Prediction
```python
# Reuses `torch`, `tokenizer`, and `model` from the Quick Start example
texts = [
    "I want to stop drinking.",
    "I don't think I have a problem.",
    "I like drinking with my friends."
]

inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)

probs = torch.softmax(outputs.logits, dim=1)
preds = torch.argmax(probs, dim=1)

# Print the predicted label and its probability for each input
for text, pred, prob in zip(texts, preds, probs):
    label = model.config.id2label[pred.item()]
    confidence = prob[pred].item()
    print(f"Text: {text}")
    print(f"Type: {label} ({confidence:.1%})")
    print()
```
## Training Details
### Hyperparameters
- **Base model**: `bert-base-uncased`
- **Max sequence length**: 128 tokens
- **Batch size**: 16
- **Learning rate**: 2e-5
- **Epochs**: 5
- **Optimizer**: AdamW
- **Loss**: Cross-entropy
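
To give a sense of the training budget these settings imply, the sketch below estimates the number of optimizer steps. It assumes exactly 2,400 training utterances (the card only states "~2,400"), no gradient accumulation, and that the final partial batch is kept. None of these details are confirmed by the card, so treat the result as an approximation.

```python
import math

# Hyperparameters as listed in this card.
hyperparams = {
    "base_model": "bert-base-uncased",
    "max_seq_length": 128,
    "batch_size": 16,
    "learning_rate": 2e-5,
    "epochs": 5,
}

# Assumption: the card reports "~2,400" training utterances; the exact count is unknown.
approx_train_samples = 2400

# One optimizer step per batch (assumes no gradient accumulation).
steps_per_epoch = math.ceil(approx_train_samples / hyperparams["batch_size"])
total_steps = steps_per_epoch * hyperparams["epochs"]

print(f"Steps per epoch: ~{steps_per_epoch}")
print(f"Total steps:     ~{total_steps}")
```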
### Hardware
The model was trained on a single GPU. An NVIDIA GPU is recommended for reproducing the training run.
## Limitations
1. **Class Imbalance**: The model performs better on "neutral" (majority class) than "change" and "sustain"
2. **Context**: The model classifies single utterances without conversation context
3. **Domain**: Trained specifically on MI conversations; may not generalize to other counseling types
4. **Language**: English only
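
To put the class-imbalance limitation in context, the sketch below derives the test-set class distribution from the confusion matrix in the Performance section and computes the accuracy of a trivial classifier that always predicts the most frequent class. It is an illustration built only from the counts reported in this card, not part of the original evaluation.

```python
# Test-set confusion matrix from this card: rows = actual, columns = predicted.
labels = ["change", "neutral", "sustain"]
cm = [
    [75, 78, 23],   # actual: change
    [43, 396, 27],  # actual: neutral
    [11, 34, 36],   # actual: sustain
]

# Support = number of test utterances whose true label is each class (row sums).
support = {label: sum(row) for label, row in zip(labels, cm)}
total = sum(support.values())
shares = {label: count / total for label, count in support.items()}

# Accuracy of a trivial classifier that always predicts the most frequent class.
majority_label = max(support, key=support.get)
majority_baseline = support[majority_label] / total

for label in labels:
    print(f"{label:8s} support={support[label]:4d} share={shares[label]:.1%}")
print(f"Majority-class baseline ({majority_label}): {majority_baseline:.1%}")
```

On these counts, "neutral" makes up roughly 64% of the test utterances, so the reported 70.1% accuracy is best read against that baseline. This is one reason to look at macro F1 alongside accuracy.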
## Ethical Considerations
- This model is intended to **assist**, not replace, human counselors
- Predictions should be reviewed by qualified professionals
- Privacy and confidentiality must be maintained when processing real counseling data
- Be aware of potential biases in training data
## Citation
If you use this model, please cite:
```bibtex
@misc{bert-mi-classifier-2024,
  author = {Ryan},
  title = {BERT for Motivational Interviewing Client Talk Classification},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/RyanDDD/bert-motivational-interviewing}}
}
```
## References
- **AnnoMI Dataset**: [GitHub](https://github.com/uccollab/AnnoMI)
- **BERT Paper**: [Devlin et al., 2019](https://arxiv.org/abs/1810.04805)
- **Motivational Interviewing**: [Miller & Rollnick, 2012](https://motivationalinterviewing.org/)
## Model Card Contact
For questions or feedback, please open an issue in the model repository.