Priority Classification Model (Nepali + English Hybrid)
Model Overview
This model automatically classifies citizen complaints and service requests into one of three priority levels (HIGH, MEDIUM, or LOW) based on the urgency and nature of the text.
It supports both Nepali and English inputs and uses a hybrid ML + rule-based approach to ensure robustness, especially on small datasets.
Model Architecture
| Component | Description |
| --- | --- |
| Embedder | sentence-transformers/all-MiniLM-L6-v2 |
| Classifier | Logistic Regression (multiclass, balanced weights) |
| Rule-based Layer | Keyword-based fallback for urgency terms in Nepali and English |
| Features | SBERT embeddings + priority keyword preservation |
| Hybrid Inference | Combines ML prediction confidence with rules for safer decisions |
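
The hybrid inference step can be pictured with a minimal sketch like the one below. It is not the model's actual rule layer: the keyword list, the confidence threshold, and the function name `hybrid_priority` are all assumptions made for illustration, and the class probabilities are passed in by hand rather than produced by the real classifier.

```python
from typing import Dict, Tuple

# Hypothetical keyword list; the model's real rule set is not shown in this README.
URGENCY_KEYWORDS = ("urgent", "emergency", "तत्काल")

# Hypothetical cut-off below which the rule layer may override the classifier.
CONFIDENCE_THRESHOLD = 0.6


def hybrid_priority(text: str, probabilities: Dict[str, float]) -> Tuple[str, str]:
    """Return (label, source), where source is "ml" or "rule"."""
    ml_label = max(probabilities, key=probabilities.get)
    ml_confidence = probabilities[ml_label]

    lowered = text.lower()
    has_urgency_term = any(keyword in lowered for keyword in URGENCY_KEYWORDS)

    # Trust a confident classifier; otherwise an urgency keyword escalates to HIGH.
    if ml_confidence < CONFIDENCE_THRESHOLD and has_urgency_term:
        return "HIGH", "rule"
    return ml_label, "ml"


nepali_text = "पानी आपूर्ति बन्द छ। तत्काल समाधान चाहिन्छ।"
print(hybrid_priority(nepali_text, {"HIGH": 0.40, "MEDIUM": 0.45, "LOW": 0.15}))
```

In this sketch the rules only ever raise a low-confidence prediction to HIGH, which is one way to read "safer decisions": a missed urgent case is treated as costlier than a false alarm.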
Training Summary
| Metric | Value |
| --- | --- |
| Total raw samples | 266 |
| After preprocessing & augmentation | 594 |
| Train/Test Split | 445 / 149 |
| Embedding Dimension | 384 |
| Classes | HIGH, MEDIUM, LOW |
| Test Accuracy | 72.5% |
| Macro F1-score | 0.72 |
Label Distribution (After Normalization)
| Label | Count |
| --- | --- |
| HIGH | 203 |
| MEDIUM | 29 |
| LOW | 34 |
Label Distribution (After Augmentation)
| Label | Count |
| --- | --- |
| HIGH | 200 |
| MEDIUM | 194 |
| LOW | 200 |
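
To show what "balanced weights" means for these two distributions, here is a small sketch of the formula scikit-learn documents for `class_weight="balanced"`: `n_samples / (n_classes * class_count)`. It assumes the classifier is scikit-learn's `LogisticRegression`, which this README does not state explicitly, and the helper name `balanced_class_weights` is hypothetical.

```python
from typing import Dict


def balanced_class_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


# Label counts taken from the two distribution tables above.
after_normalization = {"HIGH": 203, "MEDIUM": 29, "LOW": 34}
after_augmentation = {"HIGH": 200, "MEDIUM": 194, "LOW": 200}

for name, counts in [("normalized", after_normalization),
                     ("augmented", after_augmentation)]:
    weights = balanced_class_weights(counts)
    print(name, {label: round(weight, 2) for label, weight in weights.items()})
```

On the normalized counts the minority classes would receive weights of roughly 3.06 (MEDIUM) and 2.61 (LOW) against 0.44 for HIGH. After augmentation all three weights sit close to 1, so the weighting has little remaining effect.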
Classification Report
| Class | Precision | Recall | F1 | Support |
| --- | --- | --- | --- | --- |
| HIGH | 0.73 | 0.66 | 0.69 | 50 |
| MEDIUM | 0.74 | 0.80 | 0.76 | 49 |
| LOW | 0.71 | 0.72 | 0.71 | 50 |
| Overall Accuracy | | | 0.725 | 149 |
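
The summary figures can be cross-checked against the per-class rows with a short sketch. It uses only the rounded values from the report above, so recomputed numbers may differ from the reported ones in the last digit. The variable and function names are illustrative.

```python
# Per-class values copied from the classification report above.
report = {
    "HIGH":   {"precision": 0.73, "recall": 0.66, "f1": 0.69, "support": 50},
    "MEDIUM": {"precision": 0.74, "recall": 0.80, "f1": 0.76, "support": 49},
    "LOW":    {"precision": 0.71, "recall": 0.72, "f1": 0.71, "support": 50},
}


def f1_from(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_support = sum(row["support"] for row in report.values())

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row["f1"] for row in report.values()) / len(report)

# Accuracy equals the support-weighted mean of per-class recall.
accuracy_from_recall = sum(
    row["recall"] * row["support"] for row in report.values()
) / total_support

for label, row in report.items():
    print(label, round(f1_from(row["precision"], row["recall"]), 3))
print("macro F1:", round(macro_f1, 2))
print("accuracy from recall:", round(accuracy_from_recall, 3))
```

The support-weighted recall comes out within rounding error of the reported 0.725, which corresponds to about 108 of the 149 test samples classified correctly.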
Performance is acceptable (accuracy ≥70%) given the small dataset of 266 raw samples.
The model performs best on clearly marked “urgent/emergency” cases and slightly worse on borderline MEDIUM cases.
Inference (Usage)
Using the model directly (ML only or Hybrid)
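```python
from huggingface_hub import hf_hub_download
import joblib
from priority_det import Embedder, predict_priority

# Download the serialized classifier bundle from the Hugging Face Hub
model_path = hf_hub_download(repo_id="your-username/priority-classifier", filename="classifier.joblib")
bundle = joblib.load(model_path)
clf = bundle["clf"]              # trained Logistic Regression classifier
label_map = bundle["label_map"]  # mapping between class indices and priority labels

embedder = Embedder()

# Nepali input: "The water supply is cut off. An immediate solution is needed."
text = "पानी आपूर्ति बन्द छ। तत्काल समाधान चाहिन्छ।"
# use_hybrid=True selects hybrid (ML + rules) inference
result = predict_priority(text, embedder, clf, label_map, use_hybrid=True)
print(result)
```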