---
language: en
license: mit
library_name: transformers
tags:
- sentiment-analysis
- text-classification
- transformers
- mini-transformer
datasets:
- glue/sst2
model-index:
- name: mini-sentiment-transformer
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: SST-2
      type: glue
      args: sst2
    metrics:
    - type: accuracy
      value: 0.8154
      name: Validation Accuracy
---

# Mini Sentiment Transformer

This is a tiny transformer model for sentiment analysis, created as a learning project to understand transformer architecture. It is much smaller than BERT or DistilBERT, with only about 4.19M parameters (4,188,802).

## Model Details

- Developed by: leorigasaki54
- Type: Text Classification (Sentiment Analysis)
- Language: English
- Training Data: SST-2 (Stanford Sentiment Treebank)
- Size: 4,188,802 parameters (4.19M)
- Architecture (a minimal sketch follows below):
  - 2 transformer encoder layers
  - 2 attention heads per layer
  - 128 embedding dimensions
  - 256 feed-forward dimensions
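
The original modeling code is not reproduced here, but a minimal PyTorch sketch of a model with these dimensions looks roughly like the following. The vocabulary size (30,522, from the DistilBERT tokenizer) and maximum sequence length (64) are assumptions, so the printed parameter count lands close to, but not exactly at, the published 4,188,802.

```python
# Minimal sketch of the described architecture (not the original training code):
# 2 encoder layers, 2 heads, d_model=128, FFN=256, binary classification head.
import torch
import torch.nn as nn

class MiniSentimentTransformer(nn.Module):
    def __init__(self, vocab_size=30522, max_len=64, d_model=128,
                 n_heads=2, dim_ff=256, n_layers=2, n_classes=2):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           dim_feedforward=dim_ff, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, input_ids, attention_mask=None):
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.token_emb(input_ids) + self.pos_emb(positions)
        pad_mask = (attention_mask == 0) if attention_mask is not None else None
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        return self.classifier(x[:, 0])  # classify from the first token position

model = MiniSentimentTransformer()
# ~4.18M with the assumed vocab size and max length; the exact published count
# (4,188,802) depends on details not listed in this card.
print(sum(p.numel() for p in model.parameters()))
```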

## Usage

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load tokenizer and model (the model reuses the DistilBERT tokenizer)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("leorigasaki54/mini-sentiment-transformer")

# Prepare input
text = "I really enjoyed this movie!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=64)

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    probabilities = F.softmax(outputs.logits, dim=-1)
    prediction = torch.argmax(probabilities, dim=-1).item()

sentiment = "Positive" if prediction == 1 else "Negative"
confidence = probabilities[0][prediction].item()

print(f"Sentiment: {sentiment} (confidence: {confidence:.4f})")
```
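
Several texts can also be scored in one forward pass. This is a small sketch that reuses the `tokenizer`, `model`, and imports from the example above; the sample sentences are only illustrative.

```python
# Batch prediction sketch, continuing from the single-sentence example above.
texts = ["A wonderful, heartfelt film.", "The plot was dull and predictable."]
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=64)

with torch.no_grad():
    probs = F.softmax(model(**batch).logits, dim=-1)

for text, p in zip(texts, probs):
    label = "Positive" if p.argmax().item() == 1 else "Negative"
    print(f"{text} -> {label} ({p.max().item():.4f})")
```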

## Limitations

- This is a minimal implementation meant for educational purposes
- Performance may be lower than larger models like BERT or DistilBERT
- The model has been trained only on movie reviews and may not generalize well to other domains
- Limited to English text only

## Training

The model was trained on the SST-2 dataset for 5 epochs using the Adam optimizer with a learning rate of 5e-5.
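
The original training script is not included in this card; the loop below is a rough sketch consistent with that setup (SST-2 via `datasets`, 5 epochs, Adam, lr 5e-5). The batch size, tokenization settings, and the `MiniSentimentTransformer` class (from the sketch under "Model Details") are assumptions.

```python
# Rough training-loop sketch matching the description above; batch size and
# other unstated details are assumptions, not the original script.
import torch
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
sst2 = load_dataset("glue", "sst2")

def encode(batch):
    return tokenizer(batch["sentence"], padding="max_length",
                     truncation=True, max_length=64)

train = sst2["train"].map(encode, batched=True)
train.set_format("torch", columns=["input_ids", "attention_mask", "label"])
loader = DataLoader(train, batch_size=32, shuffle=True)  # batch size assumed

model = MiniSentimentTransformer()  # sketch class from "Model Details"
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(5):
    for batch in loader:
        optimizer.zero_grad()
        logits = model(batch["input_ids"], batch["attention_mask"])
        loss = loss_fn(logits, batch["label"])
        loss.backward()
        optimizer.step()
```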