Model Card for bert-imdb-sentiment
This is a fine-tuned bert-base-uncased model for binary sentiment classification on the IMDb movie reviews dataset.
The model predicts whether a given movie review is positive or negative.
Model Details
Model Description
This model is a `BertForSequenceClassification` model fine-tuned with Hugging Face Transformers on the IMDb dataset (25,000 movie reviews).
The training was done using the Trainer API with the following configuration:
- Tokenization with `BertTokenizer` (`bert-base-uncased`), max sequence length of 256.
- Fine-tuned for 3 epochs with learning rate 2e-5 and mixed precision (fp16).
- Achieved ~91.54% accuracy and ~91.54% F1 on the test split.
- Developed by: Koushik Reddy
- Model type: Transformer-based sequence classifier (`BertForSequenceClassification`)
- Language(s) (NLP): English
- Finetuned from model: `bert-base-uncased` ([Hugging Face link](https://huggingface.co/google-bert/bert-base-uncased))
Model Sources
- Repository: https://huggingface.co/koushik-25/bert-imdb-sentiment
- Paper: Devlin et al., 2018, the original BERT paper (https://arxiv.org/abs/1810.04805)
- Demo: You can test the model directly using the Inference Widget on the model page.
Intended Uses & Limitations
- ✅ Intended for sentiment classification of English movie reviews.
- ⚠️ May not generalize well to other domains (e.g., tweets, product reviews) without additional fine-tuning.
- ⚠️ May reflect biases present in the IMDb dataset and the original BERT pre-training corpus.
Direct Use
```python
from transformers import BertForSequenceClassification, BertTokenizer
import torch

# Load the fine-tuned model and tokenizer from the Hub
# (replace "your-username" with the actual repo id, e.g. koushik-25/bert-imdb-sentiment)
model = BertForSequenceClassification.from_pretrained("your-username/bert-imdb-sentiment")
tokenizer = BertTokenizer.from_pretrained("your-username/bert-imdb-sentiment")

# Inference on a single review
inputs = tokenizer("The movie was fantastic!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = torch.argmax(logits, dim=1).item()
print(["NEGATIVE", "POSITIVE"][pred])
```
Training Details
Training Data
- Dataset: IMDb movie reviews (`datasets.load_dataset('imdb')`), as sketched below.
- Size: 25,000 training and 25,000 test samples.
- Preprocessing: Tokenization with `max_length=256`, chosen based on the review length histogram.
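For reference, a minimal sketch of loading the dataset with the `datasets` library (the `unsupervised` split is not used here):

```python
from datasets import load_dataset

# Load IMDb: 25,000 labeled train reviews and 25,000 labeled test reviews
imdb = load_dataset("imdb")

print(imdb)               # DatasetDict with 'train', 'test' (and 'unsupervised') splits
print(imdb["train"][0])   # {'text': '...', 'label': 0 or 1}
```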
Training Procedure
Preprocessing
- Text was lowercased automatically because `bert-base-uncased` is a lowercase model.
- Each example was tokenized with padding to `max_length=256` and truncated if longer.
- The dataset was split into train, validation, and test as follows (see the sketch after this list):
  - train: samples 0–20,000 of the training set
  - val: samples 20,000–25,000 of the training set
  - test: the official IMDb test split
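A minimal sketch of this preprocessing and split (the card lists index ranges only; shuffling with the training seed before slicing is an assumption, since the raw IMDb train split is ordered by label):

```python
from datasets import load_dataset
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
imdb = load_dataset("imdb")

def tokenize(batch):
    # bert-base-uncased lowercases internally; pad/truncate to 256 tokens
    return tokenizer(batch["text"], padding="max_length", truncation=True, max_length=256)

# Assumption: shuffle before slicing, since the raw train split is ordered by label
shuffled = imdb["train"].shuffle(seed=224)
train_ds = shuffled.select(range(0, 20_000)).map(tokenize, batched=True)
val_ds   = shuffled.select(range(20_000, 25_000)).map(tokenize, batched=True)
test_ds  = imdb["test"].map(tokenize, batched=True)
```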
Training Hyperparameters
- Base Model: `bert-base-uncased`
- Num Labels: 2 (binary classification)
- Batch size: 4 per device (with gradient accumulation of 16 steps, so effective batch size = 64)
- Learning Rate: 2e-5
- Epochs: 3
- Optimizer: AdamW (default in Transformers)
- Mixed Precision: fp16 training enabled for faster training and reduced memory usage (`fp16=True` in `TrainingArguments`)
- Scheduler: Linear learning rate scheduler with warmup (default)
- Seed: 224
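A minimal `Trainer` setup reflecting these hyperparameters might look like the following (`output_dir` and per-epoch evaluation are assumptions; `train_ds`/`val_ds` are the tokenized splits from the preprocessing sketch above):

```python
from transformers import BertForSequenceClassification, TrainingArguments, Trainer

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

training_args = TrainingArguments(
    output_dir="bert-imdb-sentiment",   # placeholder output directory
    per_device_train_batch_size=4,
    gradient_accumulation_steps=16,     # effective batch size = 4 * 16 = 64
    learning_rate=2e-5,
    num_train_epochs=3,
    fp16=True,
    seed=224,
    eval_strategy="epoch",              # `evaluation_strategy` in older transformers versions
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
)
trainer.train()
```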
Speeds, Sizes, Times
- Training Time: Varies by GPU; typically around 15-20 minutes on a T4.
- Checkpoint Size: ~420 MB for `pytorch_model.bin` (BERT-base size plus classification head).
- Total Parameters: ~110 million.
Evaluation
Testing Data, Factors & Metrics
Testing Data
- Dataset: IMDb test split (25,000 reviews) held out from training.
- Preprocessing: Same as training; lowercased and tokenized with `max_length=256`.
Factors
- This model was evaluated on the overall IMDb test set only. No specific subgroup or domain disaggregation was done.
- The model is expected to generalize well to similar English movie review sentiment but may not be robust to domain shifts.
Metrics
- Accuracy: Measures the fraction of correctly classified reviews.
- F1 Score: Weighted average F1 across classes to balance precision and recall.
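These can be computed with a `compute_metrics` function along the following lines and passed to the `Trainer` (a minimal sketch using scikit-learn; the exact implementation is not included in this card):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    # The Trainer passes (logits, labels) for the evaluation set
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds, average="weighted"),  # weighted F1, as reported below
    }
```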
Evaluation Results
| Metric | Score |
|---|---|
| Accuracy | 91.54% |
| F1 Score | 91.54% |
Evaluated on the IMDb test set.
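Assuming the `Trainer` from the sketch above was constructed with this `compute_metrics` function, the test-set evaluation can be reproduced roughly as follows:

```python
# `trainer` and `test_ds` come from the earlier sketches;
# compute_metrics must have been passed to the Trainer for accuracy/F1 to be reported.
metrics = trainer.evaluate(eval_dataset=test_ds)
print(metrics)  # expected: eval_accuracy and eval_f1 around 0.9154
```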
Summary
This is a fine-tuned BERT model (bert-base-uncased) for binary sentiment analysis on the IMDb movie reviews dataset.
It classifies a given movie review as positive or negative with an accuracy of 91.54% and a weighted F1 score of 91.54% on the test set.
The model was trained using the Hugging Face transformers library, with tokenization based on a maximum sequence length of 256 tokens to balance coverage and efficiency.
The model is intended for English movie reviews but may generalize reasonably to similar sentiment analysis tasks on longer-form English text.