---
language: en
license: mit
library_name: transformers
tags:
- sentiment-analysis
- text-classification
- transformers
- mini-transformer
datasets:
- glue/sst2
model-index:
- name: mini-sentiment-transformer
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: SST-2
      type: glue
      args: sst2
    metrics:
    - type: accuracy
      value: 0.8154
      name: Validation Accuracy
---

# Mini Sentiment Transformer

This is a tiny transformer model for sentiment analysis, created as a learning project to understand transformer architecture. It is much smaller than BERT or DistilBERT, with only about 4.19M parameters (4,188,802).

## Model Details

- Developed by: leorigasaki54
- Type: Text Classification (Sentiment Analysis)
- Language: English
- Training Data: SST-2 (Stanford Sentiment Treebank)
- Size: 4,188,802 parameters (4.19M)
- Architecture (a minimal sketch follows below):
  - 2 transformer encoder layers
  - 2 attention heads per layer
  - 128 embedding dimensions
  - 256 feed-forward dimensions
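
The original modeling code is not reproduced here, but a minimal PyTorch sketch of a model with these dimensions looks roughly like the following. The vocabulary size (30,522, from the DistilBERT tokenizer) and maximum sequence length (64) are assumptions, so the printed parameter count lands close to, but not exactly at, the published 4,188,802.

```python
# Minimal sketch of the described architecture (not the original training code):
# 2 encoder layers, 2 heads, d_model=128, FFN=256, binary classification head.
import torch
import torch.nn as nn

class MiniSentimentTransformer(nn.Module):
    def __init__(self, vocab_size=30522, max_len=64, d_model=128,
                 n_heads=2, dim_ff=256, n_layers=2, n_classes=2):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           dim_feedforward=dim_ff, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, input_ids, attention_mask=None):
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.token_emb(input_ids) + self.pos_emb(positions)
        pad_mask = (attention_mask == 0) if attention_mask is not None else None
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        return self.classifier(x[:, 0])  # classify from the first token position

model = MiniSentimentTransformer()
# ~4.18M with the assumed vocab size and max length; the exact published count
# (4,188,802) depends on details not listed in this card.
print(sum(p.numel() for p in model.parameters()))
```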

## Usage

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load tokenizer and model (the model reuses the DistilBERT tokenizer)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("leorigasaki54/mini-sentiment-transformer")

# Prepare input
text = "I really enjoyed this movie!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=64)

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    probabilities = F.softmax(outputs.logits, dim=-1)
    prediction = torch.argmax(probabilities, dim=-1).item()

sentiment = "Positive" if prediction == 1 else "Negative"
confidence = probabilities[0][prediction].item()

print(f"Sentiment: {sentiment} (confidence: {confidence:.4f})")
```
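
Several texts can also be scored in one forward pass. This is a small sketch that reuses the `tokenizer`, `model`, and imports from the example above; the sample sentences are only illustrative.

```python
# Batch prediction sketch, continuing from the single-sentence example above.
texts = ["A wonderful, heartfelt film.", "The plot was dull and predictable."]
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=64)

with torch.no_grad():
    probs = F.softmax(model(**batch).logits, dim=-1)

for text, p in zip(texts, probs):
    label = "Positive" if p.argmax().item() == 1 else "Negative"
    print(f"{text} -> {label} ({p.max().item():.4f})")
```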

## Limitations

- This is a minimal implementation meant for educational purposes
- Performance may be lower than larger models like BERT or DistilBERT
- The model has been trained only on movie reviews and may not generalize well to other domains
- Limited to English text only

## Training

The model was trained on the SST-2 dataset for 5 epochs using the Adam optimizer with a learning rate of 5e-5.
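
The original training script is not included in this card; the loop below is a rough sketch consistent with that setup (SST-2 via `datasets`, 5 epochs, Adam, lr 5e-5). The batch size, tokenization settings, and the `MiniSentimentTransformer` class (from the sketch under "Model Details") are assumptions.

```python
# Rough training-loop sketch matching the description above; batch size and
# other unstated details are assumptions, not the original script.
import torch
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
sst2 = load_dataset("glue", "sst2")

def encode(batch):
    return tokenizer(batch["sentence"], padding="max_length",
                     truncation=True, max_length=64)

train = sst2["train"].map(encode, batched=True)
train.set_format("torch", columns=["input_ids", "attention_mask", "label"])
loader = DataLoader(train, batch_size=32, shuffle=True)  # batch size assumed

model = MiniSentimentTransformer()  # sketch class from "Model Details"
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(5):
    for batch in loader:
        optimizer.zero_grad()
        logits = model(batch["input_ids"], batch["attention_mask"])
        loss = loss_fn(logits, batch["label"])
        loss.backward()
        optimizer.step()
```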