keanteng committed on
Commit 0cd4b4f · verified · 1 Parent(s): 5712fed

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +125 -0
README.md ADDED
@@ -0,0 +1,125 @@
1
+ ---
2
+ license: agpl-3.0
3
+ language:
4
+ - en
5
+ tags:
6
+ - text-classification
7
+ - bert
8
+ - healthcare
9
+ - risk-assessment
10
+ - questionnaire-analysis
11
+ pipeline_tag: text-classification
12
+ ---
13
+
14
+ # BERT Classification Models for Healthcare Risk Assessment
15
+
16
+ This repository contains fine-tuned BERT models for classifying healthcare questionnaire responses into risk categories.
17
+
18
+ ## Model Description
19
+
20
+ Two BERT-base-uncased models have been fine-tuned for healthcare risk assessment:
21
+
22
+ 1. **Fatigue Model**: Classifies fatigue-related responses
23
+ 2. **Mental Health Model**: Classifies mental health-related responses
24
+
25
+ Both models predict three risk categories:
26
+ - **Low Risk** (0)
27
+ - **Moderate Risk** (1)
28
+ - **High Risk** (2)
29
+
30
+ ## Training Details
31
+
32
+ - **Base Model**: bert-base-uncased
33
+ - **Training Epochs**: 40
34
+ - **Batch Size**: 16
35
+ - **Learning Rate**: 2e-5
36
+ - **Optimizer**: AdamW
37
+ - **Max Sequence Length**: 128
38
+
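For reference, the hyperparameters listed above can be collected in a small configuration object. This is only an illustrative sketch: `TrainingConfig` and `batches_per_epoch` are hypothetical names that do not come from the original training script, and the dataset size used in the example is made up.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed in this README (hypothetical container)."""
    base_model: str = "bert-base-uncased"
    epochs: int = 40
    batch_size: int = 16
    learning_rate: float = 2e-5
    optimizer: str = "AdamW"
    max_seq_length: int = 128

    def batches_per_epoch(self, num_examples: int) -> int:
        """Number of batches per epoch, counting the last partial batch."""
        return math.ceil(num_examples / self.batch_size)


config = TrainingConfig()
# 1000 is an arbitrary, made-up dataset size used only for illustration.
print(config.batches_per_epoch(1000))  # 63
```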
39
+ ## Usage
40
+
41
+ ### Loading the Models
42
+
43
+ ```python
44
+ from transformers import BertTokenizer, BertForSequenceClassification
45
+ import torch
46
+
47
+ # Load tokenizer
48
+ tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
49
+
50
+ # Load fatigue model
51
+ fatigue_model = BertForSequenceClassification.from_pretrained('keanteng/bert-classification-wqd7005', subfolder='fatigue_model')
52
+
53
+ # Load mental health model
54
+ mental_health_model = BertForSequenceClassification.from_pretrained('keanteng/bert-classification-wqd7005', subfolder='mental_health_model')
55
+ ```
56
+
57
+ ### Making Predictions
58
+
59
+ ```python
60
+ def predict_risk(text, model, tokenizer, max_length=128):
61
+     # Tokenize input
62
+     inputs = tokenizer(
63
+         text,
64
+         padding='max_length',
65
+         truncation=True,
66
+         max_length=max_length,
67
+         return_tensors='pt'
68
+     )
69
+
70
+     # Make prediction
71
+     model.eval()
72
+     with torch.no_grad():
73
+         outputs = model(**inputs)
74
+         predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
75
+         predicted_class = torch.argmax(predictions, dim=-1)
76
+
77
+     # Map to risk categories
78
+     risk_labels = ['Low Risk', 'Moderate Risk', 'High Risk']
79
+     return risk_labels[predicted_class.item()], predictions[0].tolist()
80
+
81
+ # Example usage
82
+ fatigue_text = "I feel extremely tired all the time and can't complete daily tasks"
83
+ risk_category, confidence_scores = predict_risk(fatigue_text, fatigue_model, tokenizer)
84
+ print(f"Risk Category: {risk_category}")
85
+ print(f"Confidence Scores: {confidence_scores}")
86
+ ```
87
+
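To make the returned confidence scores concrete, the sketch below reproduces the softmax and argmax steps of `predict_risk` in plain Python. The `softmax` helper and the example logits are illustrative assumptions, not outputs of the actual models.

```python
import math

RISK_LABELS = ["Low Risk", "Moderate Risk", "High Risk"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for a single input; a real model would produce these.
example_logits = [0.2, 1.1, 3.4]
scores = softmax(example_logits)

# argmax: index of the highest probability, which selects the risk label
predicted_index = max(range(len(scores)), key=scores.__getitem__)

print(RISK_LABELS[predicted_index])  # High Risk
print([round(s, 3) for s in scores])
```

The scores always sum to 1, so a prediction whose top score is close to the others is a sign that the model is uncertain about that response.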
88
+ ## Model Performance
89
+
90
+ The models were trained and evaluated on healthcare questionnaire data with the following label mapping:
91
+
92
+ **Fatigue Model:**
93
+ - Fatigue levels 1-2 → Low Risk
94
+ - Fatigue level 3 → Moderate Risk
95
+ - Fatigue levels 4-5 → High Risk
96
+
97
+ **Mental Health Model:**
98
+ - Mental health levels 1-2 → High Risk
99
+ - Mental health level 3 → Moderate Risk
100
+ - Mental health levels 4-5 → Low Risk
101
+
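The label mapping above can be written as two small functions. This is a sketch that assumes questionnaire levels are integers from 1 to 5; the function names are hypothetical and are not taken from the training code. Note that the two scales run in opposite directions: a higher fatigue level means higher risk, while a higher mental health level means lower risk.

```python
LOW, MODERATE, HIGH = 0, 1, 2  # class ids shared by both models


def fatigue_to_risk(level: int) -> int:
    """Map a fatigue level (1-5) to a risk class id."""
    if level not in range(1, 6):
        raise ValueError(f"level must be between 1 and 5, got {level}")
    if level <= 2:
        return LOW
    if level == 3:
        return MODERATE
    return HIGH


def mental_health_to_risk(level: int) -> int:
    """Map a mental health level (1-5) to a risk class id (inverted scale)."""
    if level not in range(1, 6):
        raise ValueError(f"level must be between 1 and 5, got {level}")
    if level <= 2:
        return HIGH
    if level == 3:
        return MODERATE
    return LOW


print([fatigue_to_risk(n) for n in range(1, 6)])        # [0, 0, 1, 2, 2]
print([mental_health_to_risk(n) for n in range(1, 6)])  # [2, 2, 1, 0, 0]
```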
102
+ ## Training Data
103
+
104
+ The models were trained on questionnaire responses containing:
105
+ - Text descriptions of fatigue levels
106
+ - Text descriptions of mental health status
107
+ - Corresponding risk labels
108
+
109
+ Data was split 80/20 for training and validation with stratified sampling.
110
+
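As an illustration of what a stratified 80/20 split does, the sketch below splits each label group separately so that both subsets keep roughly the same class proportions. It is a plain-Python stand-in with a hypothetical `stratified_split` helper and toy data. The README does not state which routine produced the original split; scikit-learn's `train_test_split` with its `stratify` argument is a common choice for this.

```python
import random
from collections import defaultdict


def stratified_split(texts, labels, val_fraction=0.2, seed=42):
    """Split (text, label) pairs so each label keeps a similar share in both subsets."""
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val = [], []
    for label, items in by_label.items():
        rng.shuffle(items)
        n_val = round(len(items) * val_fraction)
        val.extend((t, label) for t in items[:n_val])
        train.extend((t, label) for t in items[n_val:])
    return train, val


# Toy data: 10 examples per risk class (made up for illustration).
texts = [f"response {i}" for i in range(30)]
labels = [i % 3 for i in range(30)]
train, val = stratified_split(texts, labels)
print(len(train), len(val))  # 24 6
```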
111
+ ## Intended Use
112
+
113
+ These models are designed for:
114
+ - Healthcare questionnaire analysis
115
+ - Risk assessment screening
116
+ - Research applications in healthcare NLP
117
+
118
+ **Important**: These models are for research and screening purposes only and should not replace professional medical diagnosis.
119
+
120
+ ## Limitations
121
+
122
+ - Models are trained on specific questionnaire formats
123
+ - Performance may vary on different populations or text styles
124
+ - Should be used as a screening tool, not for final diagnosis
125
+ - May reflect biases present in the training data