Model Card for bert-imdb-sentiment
This is a fine-tuned bert-base-uncased model for binary sentiment classification on the IMDb movie reviews dataset.
The model predicts whether a given movie review is positive or negative.
Model Details
Model Description
This model is a `BertForSequenceClassification` model fine-tuned with Hugging Face Transformers on the IMDb dataset (25,000 movie reviews).
The training was done using the Trainer API with the following configuration:
- Tokenization with `BertTokenizer` (`bert-base-uncased`), max sequence length of 256.
- Fine-tuned for 3 epochs with learning rate 2e-5 and mixed precision (fp16).
- Achieved ~91.54% accuracy and ~91.54% F1 on the test split.
- Developed by: Koushik Reddy
- Model type: Transformer-based sequence classifier (`BertForSequenceClassification`)
- Language(s) (NLP): English
- Finetuned from model: `bert-base-uncased` ([Hugging Face link](https://huggingface.co/google-bert/bert-base-uncased))
Model Sources
- Repository: https://huggingface.co/koushik-25/bert-imdb-sentiment
- Paper: Devlin et al., 2018, the original BERT paper (https://arxiv.org/abs/1810.04805)
- Demo: You can test the model directly using the Inference Widget on the model page.
Intended Uses & Limitations
- ✅ Intended for sentiment classification of English movie reviews.
- ⚠️ May not generalize well to other domains (e.g., tweets, product reviews) without additional fine-tuning.
- ⚠️ May reflect biases present in the IMDb dataset and the original BERT pre-training corpus.
Direct Use
```python
from transformers import BertForSequenceClassification, BertTokenizer
import torch

# Load the fine-tuned model and tokenizer from the Hub
# (replace "your-username" with the actual repo id, e.g. koushik-25/bert-imdb-sentiment)
model = BertForSequenceClassification.from_pretrained("your-username/bert-imdb-sentiment")
tokenizer = BertTokenizer.from_pretrained("your-username/bert-imdb-sentiment")

# Inference on a single review
inputs = tokenizer("The movie was fantastic!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = torch.argmax(logits, dim=1).item()
print(["NEGATIVE", "POSITIVE"][pred])
```
Training Details
Training Data
- Dataset: IMDb movie reviews (`datasets.load_dataset('imdb')`), as sketched below.
- Size: 25,000 training and 25,000 test samples.
- Preprocessing: Tokenization with `max_length=256`, chosen based on the review length histogram.
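For reference, a minimal sketch of loading the dataset with the `datasets` library (the `unsupervised` split is not used here):

```python
from datasets import load_dataset

# Load IMDb: 25,000 labeled train reviews and 25,000 labeled test reviews
imdb = load_dataset("imdb")

print(imdb)               # DatasetDict with 'train', 'test' (and 'unsupervised') splits
print(imdb["train"][0])   # {'text': '...', 'label': 0 or 1}
```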
Training Procedure
Preprocessing
- Text was lowercased automatically because `bert-base-uncased` is a lowercase model.
- Each example was tokenized with padding to `max_length=256` and truncated if longer.
- The dataset was split into train, validation, and test as follows (see the sketch after this list):
  - train: samples 0–20,000 of the training set
  - val: samples 20,000–25,000 of the training set
  - test: the official IMDb test split
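A minimal sketch of this preprocessing and split (the card lists index ranges only; shuffling with the training seed before slicing is an assumption, since the raw IMDb train split is ordered by label):

```python
from datasets import load_dataset
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
imdb = load_dataset("imdb")

def tokenize(batch):
    # bert-base-uncased lowercases internally; pad/truncate to 256 tokens
    return tokenizer(batch["text"], padding="max_length", truncation=True, max_length=256)

# Assumption: shuffle before slicing, since the raw train split is ordered by label
shuffled = imdb["train"].shuffle(seed=224)
train_ds = shuffled.select(range(0, 20_000)).map(tokenize, batched=True)
val_ds   = shuffled.select(range(20_000, 25_000)).map(tokenize, batched=True)
test_ds  = imdb["test"].map(tokenize, batched=True)
```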
Training Hyperparameters
- Base Model: `bert-base-uncased`
- Num Labels: 2 (binary classification)
- Batch size: 4 per device (with gradient accumulation of 16 steps, so effective batch size = 64)
- Learning Rate: 2e-5
- Epochs: 3
- Optimizer: AdamW (default in Transformers)
- Mixed Precision: fp16 training enabled for faster training and reduced memory usage (`fp16=True` in `TrainingArguments`)
- Scheduler: Linear learning rate scheduler with warmup (default)
- Seed: 224
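A minimal `Trainer` setup reflecting these hyperparameters might look like the following (`output_dir` and per-epoch evaluation are assumptions; `train_ds`/`val_ds` are the tokenized splits from the preprocessing sketch above):

```python
from transformers import BertForSequenceClassification, TrainingArguments, Trainer

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

training_args = TrainingArguments(
    output_dir="bert-imdb-sentiment",   # placeholder output directory
    per_device_train_batch_size=4,
    gradient_accumulation_steps=16,     # effective batch size = 4 * 16 = 64
    learning_rate=2e-5,
    num_train_epochs=3,
    fp16=True,
    seed=224,
    eval_strategy="epoch",              # `evaluation_strategy` in older transformers versions
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
)
trainer.train()
```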
Speeds, Sizes, Times
- Training Time: Varies by GPU; typically around 15-20 minutes on a T4.
- Checkpoint Size: ~420 MB for `pytorch_model.bin` (BERT-base size plus classification head).
- Total Parameters: ~110 million.
Evaluation
Testing Data, Factors & Metrics
Testing Data
- Dataset: IMDb test split (25,000 reviews) held out from training.
- Preprocessing: Same as training; lowercased and tokenized with `max_length=256`.
Factors
- This model was evaluated on the overall IMDb test set only. No specific subgroup or domain disaggregation was done.
- The model is expected to generalize well to similar English movie review sentiment but may not be robust to domain shifts.
Metrics
- Accuracy: Measures the fraction of correctly classified reviews.
- F1 Score: Weighted average F1 across classes to balance precision and recall.
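These can be computed with a `compute_metrics` function along the following lines and passed to the `Trainer` (a minimal sketch using scikit-learn; the exact implementation is not included in this card):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    # The Trainer passes (logits, labels) for the evaluation set
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds, average="weighted"),  # weighted F1, as reported below
    }
```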
Evaluation Results
| Metric | Score |
|---|---|
| Accuracy | 91.54% |
| F1 Score | 91.54% |
Evaluated on the IMDb test set.
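Assuming the `Trainer` from the sketch above was constructed with this `compute_metrics` function, the test-set evaluation can be reproduced roughly as follows:

```python
# `trainer` and `test_ds` come from the earlier sketches;
# compute_metrics must have been passed to the Trainer for accuracy/F1 to be reported.
metrics = trainer.evaluate(eval_dataset=test_ds)
print(metrics)  # expected: eval_accuracy and eval_f1 around 0.9154
```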
Summary
This is a fine-tuned BERT model (bert-base-uncased) for binary sentiment analysis on the IMDb movie reviews dataset.
It classifies a given movie review as positive or negative with an accuracy of 91.54% and a weighted F1 score of 91.54% on the test set.
The model was trained using the Hugging Face transformers library, with tokenization based on a maximum sequence length of 256 tokens to balance coverage and efficiency.
The model is intended for English movie reviews but may generalize reasonably to similar sentiment analysis tasks on longer-form English text.