---
language:
- es
- en
tags:
- sentiment-analysis
- xlm-roberta
- multilingual
- movies
license: apache-2.0
base_model:
- FacebookAI/xlm-roberta-base
---

# XLM-R Sentiment EN/ES (Movie Reviews)

Binary classifier (*Positive/Negative*) for movie reviews in **English and Spanish**, fine-tuned from `xlm-roberta-base` on the **Rotten Tomatoes movies and critic reviews dataset** from [Kaggle](https://www.kaggle.com/datasets/stefanoleone992/rotten-tomatoes-movies-and-critic-reviews-dataset).

**Metrics:**

Acc **0.8519** · F1 **0.8876** · Prec **0.8646** · Rec **0.9119** · AUC **0.9260**  
**Recommended threshold:** **0.48**
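
As a quick sanity check on the reported numbers, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the precision and recall values listed above. The function name `f1_from_pr` is illustrative and not part of the model's code.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the metrics line above.
f1 = f1_from_pr(0.8646, 0.9119)
print(round(f1, 4))  # 0.8876, matching the reported F1
```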

## Quick start
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Ricardouchub/xlmr-sentiment-es-en"
threshold = 0.48  # recommended decision threshold

tok = AutoTokenizer.from_pretrained(model_id, use_fast=True)
mdl = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

# Tokenize a single review; index 1 is read as the positive class.
enc = tok(["Excelente actuación, final predecible."], truncation=True, max_length=224, padding=True, return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    p = torch.softmax(mdl(**enc).logits, dim=-1)[:, 1].item()
print("POSITIVE" if p >= threshold else "NEGATIVE", round(p * 100, 1), "%")
```
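
The snippet above scores one review at a time, since `.item()` only works on a single value. A minimal sketch of applying the recommended threshold to several positive-class probabilities at once could look like this. The helper name `label_reviews` and the example probabilities are assumptions for illustration, not model outputs.

```python
from typing import List, Tuple

def label_reviews(pos_probs: List[float], threshold: float = 0.48) -> List[Tuple[str, float]]:
    """Map positive-class probabilities to (label, probability) pairs."""
    return [("POSITIVE" if p >= threshold else "NEGATIVE", p) for p in pos_probs]

# Made-up probabilities for illustration; real values would come from the model.
labels = label_reviews([0.91, 0.48, 0.12])
print(labels)
```

With the quick-start code, the probabilities for a batch could be obtained by replacing `.item()` with `.tolist()` and passing the resulting list to the helper.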

*Notes: split by movie (avoids leakage); minimal text cleaning. Not suitable for sensitive uses.*
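
The author's splitting code is not shown here, so the following is only a sketch of what a per-movie split can look like: whole movies are assigned to either train or test, so reviews of the same film never appear on both sides. The field names `movie_id` and `text`, the function `split_by_movie`, and the `test_frac` and `seed` defaults are all assumptions.

```python
import random
from typing import Dict, List, Tuple

def split_by_movie(
    reviews: List[Dict], test_frac: float = 0.2, seed: int = 42
) -> Tuple[List[Dict], List[Dict]]:
    """Assign whole movies to train or test so no movie appears in both."""
    movie_ids = sorted({r["movie_id"] for r in reviews})
    rng = random.Random(seed)
    rng.shuffle(movie_ids)
    n_test = max(1, int(len(movie_ids) * test_frac))
    test_ids = set(movie_ids[:n_test])
    train = [r for r in reviews if r["movie_id"] not in test_ids]
    test = [r for r in reviews if r["movie_id"] in test_ids]
    return train, test

# Toy data with hypothetical field names.
reviews = [
    {"movie_id": "m1", "text": "Great pacing."},
    {"movie_id": "m1", "text": "Too long."},
    {"movie_id": "m2", "text": "Una joya."},
    {"movie_id": "m3", "text": "Predictable ending."},
]
train, test = split_by_movie(reviews)
train_movies = {r["movie_id"] for r in train}
test_movies = {r["movie_id"] for r in test}
print(train_movies & test_movies)  # set(): no movie is shared
```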

**Author: Ricardo Urdaneta**