---
language:
- en
license: apache-2.0
base_model: google/flan-t5-base
tags:
- text2text-generation
- summarization
- xsum
- lora
- peft
datasets:
- EdinburghNLP/xsum
metrics:
- rouge
---
# FLAN-T5-Base Fine-tuned on XSum with LoRA
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the [XSum dataset](https://huggingface.co/datasets/EdinburghNLP/xsum) using **LoRA (Low-Rank Adaptation)** for parameter-efficient fine-tuning.
## Model Description
- **Base Model:** google/flan-t5-base
- **Task:** Extreme Summarization (one-sentence summaries)
- **Dataset:** XSum (BBC news articles)
- **Training Method:** LoRA (Low-Rank Adaptation)
- **Parameters:** ~1.77M trainable (~0.71% of 249.35M total)
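With `peft` installed, the trainable-parameter count can be verified directly (a quick check using the adapter repository named in this card):

```python
from transformers import AutoModelForSeq2SeqLM
from peft import PeftModel

base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
model = PeftModel.from_pretrained(base, "AmanSrivastava80815/flan-t5-base-xsum-lora")

# Prints trainable params, total params, and trainable %
model.print_trainable_parameters()
```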
## Training Details
### LoRA Configuration
- **Rank (r):** 16
- **Alpha:** 32
- **Target modules:** q, v (attention query and value projections)
- **Dropout:** 0.05
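The configuration above corresponds to the following `peft` `LoraConfig` (a reconstruction from the values listed here, not the original training script):

```python
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,  # encoder-decoder task
    r=16,                             # LoRA rank
    lora_alpha=32,                    # scaling factor
    target_modules=["q", "v"],        # attention query/value projections
    lora_dropout=0.05,
)
```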
### Training Hyperparameters
- **Learning rate:** 3e-4
- **Batch size:** 8
- **Epochs:** 3
- **Optimizer:** AdamW
- **Mixed precision:** FP16
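A minimal `Seq2SeqTrainingArguments` setup matching these hyperparameters might look like the following (illustrative only; the original training script is not part of this card, and `output_dir` is a placeholder):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-xsum-lora",  # placeholder path
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    optim="adamw_torch",  # AdamW
    fp16=True,            # mixed precision
)
```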
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

# Load the base model and tokenizer
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
tokenizer = AutoTokenizer.from_pretrained("AmanSrivastava80815/flan-t5-base-xsum-lora")

# Attach the LoRA adapters to the base model
model = PeftModel.from_pretrained(base_model, "AmanSrivastava80815/flan-t5-base-xsum-lora")
model.eval()

# Generate a one-sentence summary
text = "Your article text here..."
inputs = tokenizer("summarize: " + text, return_tensors="pt", max_length=512, truncation=True)
with torch.no_grad():
    outputs = model.generate(**inputs, max_length=64, num_beams=4, length_penalty=2.0)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
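For deployment without a `peft` dependency, the adapters can optionally be merged into the base weights (standard `peft` functionality; the output path is a placeholder):

```python
# Fold the LoRA deltas into the base weights and save a standalone model
merged = model.merge_and_unload()
merged.save_pretrained("flan-t5-base-xsum-lora-merged")
tokenizer.save_pretrained("flan-t5-base-xsum-lora-merged")
```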
## Performance
Evaluation metrics on the XSum test set:
- **ROUGE-1:** [Add your score]
- **ROUGE-2:** [Add your score]
- **ROUGE-L:** [Add your score]
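The scores above can be computed with the `evaluate` library; a sketch, assuming `predictions` holds model-generated summaries and `references` the gold XSum summaries:

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["..."]  # model-generated summaries (placeholder)
references = ["..."]   # gold one-sentence summaries from XSum (placeholder)
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```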
## Citation
If you use this model, please cite the original FLAN-T5 paper and the XSum dataset:
```bibtex
@article{chung2022scaling,
  title={Scaling instruction-finetuned language models},
  author={Chung, Hyung Won and others},
  journal={arXiv preprint arXiv:2210.11416},
  year={2022}
}

@inproceedings{narayan2018don,
  title={Don't give me the details, just the summary! Topic-aware convolutional neural networks for extreme summarization},
  author={Narayan, Shashi and others},
  booktitle={Proceedings of EMNLP},
  year={2018}
}
```
## License
This model inherits the Apache 2.0 license from the base model.
---
**Trained by:** AmanSrivastava80815