XGBoost - Game Review Sentiment Analysis
Model Description
This model performs sentiment analysis on game reviews, classifying them into three categories:
- Positive: Favorable reviews
- Mixed: Neutral or mixed sentiment reviews
- Negative: Unfavorable reviews
Model Type: XGBoost
Training Date: 2025-11-09
Performance
Test Set Metrics
| Metric | Score |
|---|---|
| Accuracy | 0.8471 |
| F1-Score (weighted) | 0.8472 |
| Precision (weighted) | 0.8474 |
| Recall (weighted) | 0.8471 |
Training Information
- Training Time: 1490.49 seconds (about 25 minutes)
- Training Samples: 629,884
- Validation Samples: 78,735
- Test Samples: 78,737
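The three counts sum to 787,356 reviews, which works out to roughly an 80/10/10 split. A quick check of that arithmetic, with variable names chosen only for illustration:

```python
# Sample counts as reported in the training information above.
splits = {"train": 629_884, "validation": 78_735, "test": 78_737}

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}

for name, fraction in fractions.items():
    print(f"{name}: {splits[name]:,} samples ({fraction:.1%})")
```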
Model Configuration
{
  "model_name": "XGBoost",
  "embedding_model": "BAAI/bge-m3",
  "n_estimators": 3000,
  "max_depth": 6,
  "learning_rate": 0.1,
  "subsample": 1.0,
  "colsample_bytree": 1.0,
  "subset": 1.0
}
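As a rough sketch of how this configuration might map onto an estimator: `n_estimators`, `max_depth`, `learning_rate`, `subsample`, and `colsample_bytree` are standard `XGBClassifier` arguments, while `model_name`, `embedding_model`, and `subset` are treated here as project metadata. The card does not say what `subset` means; presumably it is the fraction of the data used, but that is an assumption. The helper name `xgb_params_from_config` is hypothetical.

```python
import json

# The configuration exactly as listed above.
config = json.loads("""
{
  "model_name": "XGBoost",
  "embedding_model": "BAAI/bge-m3",
  "n_estimators": 3000,
  "max_depth": 6,
  "learning_rate": 0.1,
  "subsample": 1.0,
  "colsample_bytree": 1.0,
  "subset": 1.0
}
""")

# Keys that are genuine XGBClassifier constructor arguments.
XGB_KEYS = ("n_estimators", "max_depth", "learning_rate",
            "subsample", "colsample_bytree")


def xgb_params_from_config(cfg):
    """Return only the entries that map onto XGBoost hyperparameters."""
    return {key: cfg[key] for key in XGB_KEYS if key in cfg}


params = xgb_params_from_config(config)
print(params)

try:
    from xgboost import XGBClassifier  # optional dependency

    # An untrained estimator with the same settings, not the released model.
    untrained = XGBClassifier(**params)
except ImportError:
    untrained = None
```

This only recreates the hyperparameters; the trained weights still come from `classifier.pkl`.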
Usage
Loading the Model
from pathlib import Path
import pickle

# Load the model components
model_dir = Path("path/to/model")

with open(model_dir / 'vectorizer.pkl', 'rb') as f:
    vectorizer = pickle.load(f)

with open(model_dir / 'classifier.pkl', 'rb') as f:
    classifier = pickle.load(f)

with open(model_dir / 'label_encoder.pkl', 'rb') as f:
    label_encoder = pickle.load(f)
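The three loads above can also be wrapped in a small helper that fails early when a file is missing. This is a minimal sketch; `load_components` and `COMPONENT_FILES` are hypothetical names, not part of the project. Note that `pickle.load` can execute arbitrary code, so only unpickle files from a source you trust.

```python
import pickle
from pathlib import Path

# File names as listed in this card.
COMPONENT_FILES = {
    "vectorizer": "vectorizer.pkl",
    "classifier": "classifier.pkl",
    "label_encoder": "label_encoder.pkl",
}


def load_components(model_dir):
    """Load all pickled components, raising early if any file is missing."""
    model_dir = Path(model_dir)
    missing = [name for name in COMPONENT_FILES.values()
               if not (model_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing model files in {model_dir}: {missing}")

    components = {}
    for key, filename in COMPONENT_FILES.items():
        with open(model_dir / filename, "rb") as f:
            components[key] = pickle.load(f)  # only unpickle trusted files
    return components


# Usage, once the files are in place:
# components = load_components("path/to/model")
# vectorizer = components["vectorizer"]
```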
Making Predictions
# Example reviews
reviews = [
    "This game is absolutely amazing! Best game I've played this year.",
    "It's okay, nothing special but not terrible either.",
    "Terrible game, waste of money and time."
]
# Transform and predict
X = vectorizer.transform(reviews)
predictions_encoded = classifier.predict(X)
predictions = label_encoder.inverse_transform(predictions_encoded)
print(predictions)
# Expected labels: positive, mixed, negative (printed as a NumPy array)
# Get probabilities
probabilities = classifier.predict_proba(X)
print(probabilities)
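`predict_proba` returns one column per class, so the raw array is easier to read once each row is paired with a label. The sketch below assumes `label_encoder` is a scikit-learn `LabelEncoder`, in which case the columns follow the order of `label_encoder.classes_`; the card does not confirm this. The helper name `top_predictions` and the probability values are made up for illustration.

```python
def top_predictions(probabilities, class_names):
    """Pair each row of class probabilities with its most likely label."""
    results = []
    for row in probabilities:
        row = [float(p) for p in row]
        best = max(range(len(row)), key=row.__getitem__)
        results.append((class_names[best], row[best]))
    return results


# Made-up probabilities; real values come from classifier.predict_proba(X).
example_probabilities = [
    [0.05, 0.10, 0.85],
    [0.55, 0.25, 0.20],
]

# Assumed column order; with the real model use list(label_encoder.classes_).
example_classes = ["mixed", "negative", "positive"]

for label, confidence in top_predictions(example_probabilities, example_classes):
    print(f"{label}: {confidence:.2f}")
```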
Per-Class Performance
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Positive | 0.9242 | 0.9167 | 0.9204 | 45859 |
| Mixed | 0.5851 | 0.5840 | 0.5845 | 12697 |
| Negative | 0.8381 | 0.8546 | 0.8463 | 20181 |
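The supports in this table sum to 78,737, the size of the test set, and the support-weighted averages of the per-class values reproduce the overall test set metrics. The sketch below recomputes them and also shows the unweighted (macro) averages for comparison; the function names are illustrative.

```python
# Per-class test results copied from the table above:
# class -> (precision, recall, f1, support)
per_class = {
    "Positive": (0.9242, 0.9167, 0.9204, 45859),
    "Mixed": (0.5851, 0.5840, 0.5845, 12697),
    "Negative": (0.8381, 0.8546, 0.8463, 20181),
}

total_support = sum(row[3] for row in per_class.values())


def weighted_average(column):
    """Support-weighted mean of one metric (0=precision, 1=recall, 2=f1)."""
    return sum(row[column] * row[3] for row in per_class.values()) / total_support


def macro_average(column):
    """Unweighted mean of one metric across the three classes."""
    return sum(row[column] for row in per_class.values()) / len(per_class)


for name, column in (("Precision", 0), ("Recall", 1), ("F1-Score", 2)):
    print(f"{name}: weighted={weighted_average(column):.4f}, "
          f"macro={macro_average(column):.4f}")
```

The macro averages come out lower than the weighted ones because the Mixed class, which scores worst, counts equally with the much larger Positive class.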
Feature Importance
The model identifies important words/phrases for each sentiment class. See results.json for the complete feature importance analysis.
Limitations
- The model is trained specifically on game reviews and may not generalize well to other domains
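- Performance may degrade on reviews with sarcasm or nuanced sentiment; the Mixed class is the weakest on the test set (F1 0.5845, versus 0.9204 for Positive and 0.8463 for Negative)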
- The model treats text as bag-of-words and doesn't capture word order
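The word-order limitation can be illustrated with a toy example: two reviews with opposite meanings produce identical word counts. The tokenizer below is a deliberately simple stand-in, not the project's vectorizer, and the reviews are invented.

```python
from collections import Counter
import re


def bag_of_words(text):
    """Lowercase the text and count its word tokens, discarding order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


review_a = "The story is good, not bad."
review_b = "The story is bad, not good."

# Opposite meanings, identical bag-of-words representation.
print(bag_of_words(review_a) == bag_of_words(review_b))
```

A vectorizer configured with word n-grams would recover some local order, but the card does not say whether this one is.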
Training Details
This model was trained as part of a game review sentiment analysis project. For more information, see the project repository.
Files
- vectorizer.pkl: TF-IDF vectorizer
- classifier.pkl: Trained classifier
- label_encoder.pkl: Label encoder for sentiment classes
- config.json: Model configuration
- results.json: Complete training results and metrics
Citation
If you use this model, please cite:
@misc{game_review_sentiment,
  author = {Game Review Sentiment Analysis Project},
  title = {Sentiment Analysis Model for Game Reviews},
  year = {2025},
  url = {https://huggingface.co/xgboost}
}