Initial upload of Multilingual-Restaurant-Reviews-Sentiment
README.md (CHANGED)
Hey there! This isn't just _another_ sentiment model. This is a fine-tuned powerhouse specifically designed to understand the nuance of 1-to-5 star restaurant reviews across **5 different languages**.

It was trained on a massive, perfectly balanced dataset of **400,000+ real, human-written reviews** and achieves state-of-the-art performance.

## ✨ Model Features
This model was trained as a **regression** task, so it predicts a single number rather than a class label. Since the output is a single float, you'll want to round it to get a final "star" rating.
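As a quick illustration of that mapping (the raw value below is hypothetical, not an actual model output):

```python
raw_score = 3.7                          # hypothetical raw regression output on the 0-4 label scale
star_rating = int(round(raw_score)) + 1  # 3.7 -> 4 -> 5 stars (labels 0-4 map to 1-5 stars)
```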
### ⚠️ A Critical Note on Input Format

**This is very important for getting the best performance!**

This model was not just trained on review text; it was trained using a specific format that includes **both the review title and the review text**, separated by the `[SEP]` token.

The title often contains a powerful summary of the sentiment (e.g., "Best Pasta Ever!" or "Total Rip-off!"). Using this format ensures the model gets the same type of input it was trained on.

**Correct Format:**

`input_text = review_title + " [SEP] " + review_text`

If you only have the review text, the model will still work well, but performance will be slightly lower.
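If your data keeps titles and bodies in separate fields, a small helper keeps the formatting consistent. This is just an illustrative sketch (the function name and the fallback for a missing title are not from the original card):

```python
def build_model_input(review_title: str, review_text: str) -> str:
    """Join a review title and body into the "title [SEP] text" format the model expects."""
    if review_title:
        return f"{review_title} [SEP] {review_text}"
    # No title available: plain review text still works, just with slightly lower accuracy.
    return review_text
```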
### Pipeline Usage Example

Here is how you should format your inputs before passing them to the pipeline:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
import numpy as np  # Make sure to import numpy

model_name = "Festooned/Multilingual-Restaurant-Reviews-Sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Let's create a pipeline
sentiment_pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Example reviews using the recommended "title [SEP] text" format
reviews = [
    "Absolutely incredible [SEP] This was the best pasta I've ever had in my life.",  # 5-star
    "Servicio terrible [SEP] El servicio fue terrible y la comida tardó una hora en llegar.",  # 1-star (Spanish: "Terrible service / The service was terrible and the food took an hour to arrive.")
    "It was fine [SEP] It was... fine. Nothing special, but not bad either.",  # 3-star
]

# Get the raw predictions
raw_preds = sentiment_pipe(reviews)
print(raw_preds)

# (Remember our labels are 0-4, so we add 1)
for text, pred in zip(reviews, raw_preds):
    # 'score' is the raw regression value (our model predicts 0-4)
    raw_score = pred['score']

    # Round and clamp to be safe (0-4)
    star_label_rounded = np.clip(round(raw_score), 0, 4)

    # Add 1 to get the 1-5 star rating
    final_star_rating = int(star_label_rounded + 1)

    print(f"Review: {text[:40]}...")
    print(f"  Final Rating: {final_star_rating} stars\n")
```
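If you prefer to skip the pipeline wrapper, you can also call the model directly and read the regression value from the logits. Treat this as a minimal sketch rather than part of the original card: it assumes the checkpoint exposes a single regression output (`num_labels = 1`), so the lone logit is the raw 0-4 score.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "Festooned/Multilingual-Restaurant-Reviews-Sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

texts = ["Absolutely incredible [SEP] This was the best pasta I've ever had in my life."]
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    # Assumed shape: (batch_size, 1) for a single-output regression head
    logits = model(**inputs).logits

for text, raw in zip(texts, logits.squeeze(-1).tolist()):
    stars = int(min(max(round(raw), 0), 4)) + 1  # same round / clamp / +1 mapping as above
    print(f"{text[:40]}... -> {stars} stars (raw score {raw:.2f})")
```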
---