---
language: en
license: mit
library_name: transformers
tags:
  - sentiment-analysis
  - text-classification
  - transformers
  - mini-transformer
datasets:
  - glue/sst2
model-index:
  - name: mini-sentiment-transformer
    results:
      - task:
          type: text-classification
          name: Sentiment Analysis
        dataset:
          name: SST-2
          type: glue
          args: sst2
        metrics:
          - type: accuracy
            value: 0.8154
            name: Validation Accuracy
---

# Mini Sentiment Transformer

This is a tiny transformer model for sentiment analysis, created as a learning project to understand transformer architecture. It is far smaller than BERT or DistilBERT, with only 4,188,802 parameters (about 4.19M, versus roughly 66M for DistilBERT and 110M for BERT-base).

## Model Details

- **Developed by:** leorigasaki54
- **Type:** Text Classification (Sentiment Analysis)
- **Language:** English
- **Training Data:** SST-2 (Stanford Sentiment Treebank)
- **Size:** 4,188,802 parameters (~4.19M)
- **Architecture** (see the sketch after this list):
  - 2 transformer encoder layers
  - 2 attention heads per layer
  - 128 embedding dimensions
  - 256 feed-forward dimensions
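
This card doesn't reproduce the repository's source, but the listed hyperparameters map onto PyTorch's built-in encoder roughly as below. Everything beyond those numbers — the class name, the learned positional embeddings, the mean pooling, and the 30,522-token vocabulary of DistilBERT's tokenizer — is an assumption for illustration, not the actual implementation:

```python
import torch
import torch.nn as nn

class MiniSentimentTransformer(nn.Module):
    """Illustrative sketch of the described architecture (hypothetical, not the actual source)."""

    def __init__(self, vocab_size=30522, d_model=128, n_heads=2,
                 d_ff=256, n_layers=2, max_len=64, num_classes=2):
        super().__init__()
        # Token embeddings plus learned positional embeddings (assumed)
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=d_ff, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, input_ids, attention_mask=None):
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.tok_emb(input_ids) + self.pos_emb(positions)
        # TransformerEncoder expects True where a position should be ignored
        pad_mask = (attention_mask == 0) if attention_mask is not None else None
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        if attention_mask is not None:
            # Mean-pool over real (non-padding) tokens only
            mask = attention_mask.unsqueeze(-1).float()
            pooled = (x * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
        else:
            pooled = x.mean(dim=1)
        return self.classifier(pooled)  # raw logits for the 2 classes
```

With these defaults the parameter count lands in the same ~4.19M ballpark, dominated by the 30,522 × 128 token-embedding table; the exact figure depends on details (pooling head, positional scheme) that this sketch only guesses at.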

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

# Load tokenizer and model (the model reuses DistilBERT's tokenizer)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("leorigasaki54/mini-sentiment-transformer")

# Prepare input
text = "I really enjoyed this movie!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=64)

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    probabilities = F.softmax(outputs.logits, dim=-1)
    prediction = torch.argmax(probabilities, dim=-1).item()

sentiment = "Positive" if prediction == 1 else "Negative"
confidence = probabilities[0][prediction].item()

print(f"Sentiment: {sentiment} (confidence: {confidence:.4f})")
```

## Limitations

- This is a minimal implementation meant for educational purposes
- Performance is lower than that of larger models such as BERT or DistilBERT
- The model was trained only on movie-review sentences (SST-2) and may not generalize well to other domains
- English-only

## Training

The model was trained on the SST-2 dataset for 5 epochs with the Adam optimizer at a learning rate of 5e-5; a reconstruction of this recipe is sketched below.
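
The training script itself isn't included in the repo, so this loop is only a reconstruction of the stated recipe using the `datasets` library. The optimizer, learning rate, and epoch count come from the card; the batch size of 32, the 64-token max length, and the plain cross-entropy objective are assumptions:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
# Loaded from the released checkpoint purely for illustration; the original
# run would have started from randomly initialized weights.
model = AutoModelForSequenceClassification.from_pretrained(
    "leorigasaki54/mini-sentiment-transformer")
sst2 = load_dataset("glue", "sst2")  # train / validation / test splits

def collate(examples):
    # Tokenize a batch of SST-2 sentences and attach the 0/1 labels
    enc = tokenizer([ex["sentence"] for ex in examples], return_tensors="pt",
                    padding=True, truncation=True, max_length=64)
    enc["labels"] = torch.tensor([ex["label"] for ex in examples])
    return enc

loader = DataLoader(sst2["train"], batch_size=32, shuffle=True, collate_fn=collate)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)  # per the card

model.train()
for epoch in range(5):
    for batch in loader:
        labels = batch.pop("labels")
        loss = F.cross_entropy(model(**batch).logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Quick check against the reported 0.8154 validation accuracy
model.eval()
correct = 0
val_loader = DataLoader(sst2["validation"], batch_size=64, collate_fn=collate)
with torch.no_grad():
    for batch in val_loader:
        labels = batch.pop("labels")
        correct += (model(**batch).logits.argmax(dim=-1) == labels).sum().item()
print(f"Validation accuracy: {correct / len(sst2['validation']):.4f}")
```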