# legal-bart-summarizer

This is a fine-tuned version of facebook/bart-large, trained specifically to summarize long legal documents using the LegalSum dataset. It's designed to take dense legal texts and produce clear, concise summaries, making legal content easier to digest.
## About the Model
The base model is facebook/bart-large, a powerful encoder-decoder architecture that works well for sequence-to-sequence tasks like summarization. I fine-tuned it on the full LegalSum dataset, which contains legal documents paired with human-written extractive summaries.
The model handles input sequences up to 1024 tokens and generates summaries capped at 512 tokens. It was trained over 5 epochs using a batch size of 4 and a learning rate of 3e-5, with mixed precision (fp16) to speed things up and save memory.
## How to Use
Here’s a quick example using 🤗 Transformers:
```python
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("whyredfire/legal-bart-summarizer")
model = BartForConditionalGeneration.from_pretrained("whyredfire/legal-bart-summarizer")

text = "Insert your legal document here..."

# Tokenize, truncating to the model's 1024-token input limit
inputs = tokenizer([text], max_length=1024, truncation=True, return_tensors="pt")

# Generate a summary of up to 512 tokens with beam search
summary_ids = model.generate(inputs["input_ids"], max_length=512, num_beams=4, early_stopping=True)

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```
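For documents longer than the 1024-token input limit, truncation silently drops the tail of the text. One common workaround is to summarize the document in overlapping chunks and stitch the partial summaries together. Below is a minimal sketch of that approach; the `window`, `stride`, and per-chunk `max_summary` values are illustrative assumptions, not settings from the original training.

```python
# Reuses `tokenizer` and `model` from the example above.
def summarize_long(text, window=1024, stride=896, max_summary=256):
    # Tokenize without truncation so the full document is kept
    ids = tokenizer(text, return_tensors="pt", truncation=False)["input_ids"][0]
    parts = []
    for start in range(0, len(ids), stride):
        chunk = ids[start:start + window].unsqueeze(0)  # shape (1, <=window)
        out = model.generate(chunk, max_length=max_summary, num_beams=4, early_stopping=True)
        parts.append(tokenizer.decode(out[0], skip_special_tokens=True))
        if start + window >= len(ids):
            break
    return " ".join(parts)

print(summarize_long("Insert your very long legal document here..."))
```

The overlap between windows (window minus stride) helps avoid cutting context mid-argument; tune the stride to your documents.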
## Training Details
- Base model: facebook/bart-large (a sketch of the full configuration follows this list)
- Epochs: 5
- Batch size: 4
- Max input length: 1024
- Max summary length: 512
- Learning rate: 3e-5
- Gradient accumulation: 4 steps
- Warmup ratio: 10%
- Weight decay: 0.01
- Mixed precision: Enabled (fp16)
- Seed: 42
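These settings map onto a standard 🤗 `Seq2SeqTrainer` setup. Here's a minimal, assumed reconstruction of the training arguments; the actual training script isn't published, and dataset loading and preprocessing are omitted.

```python
from transformers import Seq2SeqTrainingArguments

# Assumed reconstruction of the listed hyperparameters; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="legal-bart-summarizer",
    num_train_epochs=5,
    per_device_train_batch_size=4,   # batch size 4
    gradient_accumulation_steps=4,   # effective batch size 16
    learning_rate=3e-5,
    warmup_ratio=0.1,                # 10% warmup
    weight_decay=0.01,
    fp16=True,                       # mixed precision
    seed=42,
    predict_with_generate=True,
    generation_max_length=512,       # max summary length
)
```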
## Evaluation Results
- Test loss: 0.98
- Test runtime: ~19.5 seconds
- Samples/sec: ~55.7
- Steps/sec: ~13.96
Evaluation was run with Hugging Face's Trainer, which computed ROUGE and related summarization metrics under the hood.
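If you want to compute ROUGE on your own outputs, a minimal sketch with the `evaluate` library looks like this; `predictions` and `references` are placeholder strings, not the actual test split used above.

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["The court dismissed the appeal."]        # model outputs
references = ["The appeal was dismissed by the court."]  # reference summaries
print(rouge.compute(predictions=predictions, references=references))
```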
## Use Cases
This model is a good fit if you're working with:
- Legal research tools
- Brief generation for lawyers and law students
- Making court rulings and legal judgments more readable
## Limitations
While the model does a decent job on civil-law-style documents, it's not bulletproof. It may struggle with out-of-domain legal texts such as criminal or tax law, and, like most generative models, it can sometimes hallucinate or omit important legal details. Always double-check the output before relying on it for anything serious.
## License
Please make sure your use complies with the licensing terms of both the base model (facebook/bart-large) and the LegalSum dataset. This model is shared for research and experimentation purposes.
## Reference
If you’re looking for more context on the dataset and task setup, check out the paper:
[CivilSum: A Dataset for Abstractive Summarization of Indian Court Decisions](https://dl.acm.org/doi/pdf/10.1145/3626772.3657859)