AventIQ-AI/English-To-Chinese


DeepakKumarMSL committed on Jun 10

Commit 6acce8a · verified · 1 parent: 5a75830

Create README.md

Files changed (1)

README.md +63 -0


	@@ -0,0 +1,63 @@

+# English to Chinese Translation (Quantized Model)
+This repository contains a **quantized English-to-Chinese translation model**, fine-tuned on the `wlhb/Transaltion-Chinese-2-English` dataset and optimized with **dynamic quantization** for efficient CPU inference.
+## 🔧 Model Details
+- **Base model**: Helsinki-NLP/opus-mt-en-zh
+- **Dataset**: `wlhb/Transaltion-Chinese-2-English`
+- **Training platform**: Kaggle (CUDA GPU)
+- **Fine-tuned**: On English-Chinese pairs from the Hugging Face dataset
+- **Quantization**: PyTorch Dynamic Quantization (`torch.quantization.quantize_dynamic`)
+- **Tokenizer**: Saved alongside the model
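The dynamic quantization step listed above can be sketched on a toy model. This is a minimal illustration, not the author's training script: `TinyModel` and its layer sizes are hypothetical stand-ins for the fine-tuned translation model, and quantizing only `nn.Linear` layers to `torch.qint8` is an assumed (though commonly used) configuration.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for the fine-tuned model; sizes are arbitrary."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 32)
        self.fc2 = nn.Linear(32, 4)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


# Dynamic quantization is applied to a model in eval mode, on CPU.
model = TinyModel().eval()

# Assumption: quantize only the Linear layers, storing their weights as int8.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The Linear layers have been swapped for dynamically quantized versions.
print(type(quantized.fc1))

# Inference works as before; inputs and outputs remain float tensors.
with torch.no_grad():
    out = quantized(torch.randn(2, 16))
print(out.shape)
```

Only the weights of the selected layer types are stored as int8; activations are quantized on the fly at inference time, which is why this approach targets CPU inference. A quantized module's `state_dict` keys differ from those of the float model, so it is worth verifying that a saved checkpoint reloads with its quantized weights intact.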
+## 📁 Folder Structure
+quantized_model/
+├── config.json
+├── pytorch_model.bin
+├── tokenizer_config.json
+├── tokenizer.json
+└── vocab.json / merges.txt
+
+---
+## 🚀 Usage
+### 🔹 1. Load Quantized Model for Inference
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
+# Load the tokenizer saved alongside the model
+tokenizer = AutoTokenizer.from_pretrained("./quantized_model")
+# Load the quantized model and switch to inference mode
+model = AutoModelForSeq2SeqLM.from_pretrained("./quantized_model")
+model.eval()
+# Build a translation pipeline on CPU (device=-1)
+translator = pipeline("translation_en_to_zh", model=model, tokenizer=tokenizer, device=-1)
+text = "How are you?"
+print("English:", text)
+print("Chinese:", translator(text)[0]["translation_text"])
+```
+## Model Training Summary
+- Loaded dataset: `wlhb/Transaltion-Chinese-2-English`
+- Mapped translation data to `{"en": ..., "zh": ...}` before training
+- Training: 3 epochs on GPU
+- Disabled: wandb logging
+- Skipped: evaluation phase
+- Saved: trained and quantized model, plus tokenizer
+- Quantization: `torch.quantization.quantize_dynamic`, used for efficient CPU inference
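The mapping step in the summary above can be sketched as a small function. The source column names `english` and `chinese` are hypothetical: this README does not state the dataset's actual schema, so they would need adjusting to match it. The sample rows are illustrative only.

```python
from typing import Dict

# Hypothetical source column names; the dataset's real schema is not given here.
SRC_EN = "english"
SRC_ZH = "chinese"


def to_translation_pair(row: Dict[str, str]) -> Dict[str, str]:
    """Map one raw row to the {"en": ..., "zh": ...} form used for training."""
    return {"en": row[SRC_EN].strip(), "zh": row[SRC_ZH].strip()}


# Illustrative rows, not taken from the dataset.
rows = [
    {"english": "How are you? ", "chinese": "你好吗？"},
    {"english": "Thank you.", "chinese": " 谢谢。"},
]

pairs = [to_translation_pair(r) for r in rows]
print(pairs[0])
```

With the Hugging Face `datasets` library, a function like this would typically be passed to `Dataset.map` so the conversion runs over every example before tokenization.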