PEGASUS-BillSum
Property | Value |
---|---|
Base Model | google/pegasus-large |
Training Data | BillSum Dataset |
Model Hub | Hugging Face |
ROUGE-1 Score | 56.87 |
What is pegasus-billsum?
PEGASUS-BillSum is a text summarization model fine-tuned on the BillSum dataset to generate concise summaries of legislative bills. It starts from the google/pegasus-large checkpoint and is adapted to the dense, formal language of legislation, with the aim of producing accurate, coherent summaries.
Implementation Details
The model was trained with transformers v4.13. Training ran for 12,000 steps (about 6.6 epochs) using the Adafactor optimizer with a learning rate of 2e-4 and label smoothing of 0.1. Inputs are limited to 1024 tokens, and generated summaries to 256 tokens.
- Uses beam search with 8 beams for generation
- Trained with batch size of 2 per device across multiple GPUs
- Reported ROUGE scores: ROUGE-1: 56.87, ROUGE-2: 38.65, ROUGE-L: 44.84
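The hyperparameters above can be collected in one place, as in the minimal sketch below. The original training script is not shown here, so `FineTuneConfig` and its methods are hypothetical names used only for illustration. The keys returned by `trainer_kwargs()` follow `Seq2SeqTrainingArguments` field names from transformers; the `adafactor` flag is how releases around v4.13 selected the optimizer, while newer releases use the `optim` argument instead, so check the installed version before passing these through.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as stated in this card (hypothetical container)."""

    base_model: str = "google/pegasus-large"
    max_steps: int = 12_000
    epochs: float = 6.6
    learning_rate: float = 2e-4
    label_smoothing_factor: float = 0.1
    per_device_train_batch_size: int = 2

    def steps_per_epoch(self) -> float:
        # Derived from the two reported figures: 12,000 steps over 6.6 epochs.
        return self.max_steps / self.epochs

    def trainer_kwargs(self) -> dict:
        # Assumption: these would be passed to Seq2SeqTrainingArguments,
        # together with an output_dir chosen by the user.
        return {
            "max_steps": self.max_steps,
            "learning_rate": self.learning_rate,
            "label_smoothing_factor": self.label_smoothing_factor,
            "per_device_train_batch_size": self.per_device_train_batch_size,
            "adafactor": True,  # v4.13-era flag; newer releases use optim=...
        }


config = FineTuneConfig()
print(f"about {config.steps_per_epoch():.0f} optimizer steps per epoch")
```

The number of GPUs and any gradient accumulation are not stated, so the effective batch size is deliberately left out of the sketch.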
Core Capabilities
- Efficient summarization of legislative documents
- Handles complex legal terminology and structure
- Generates concise, accurate summaries maintaining key information
- Accepts inputs of up to 1024 tokens; longer documents must be truncated or split before summarization
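The limits given in this card (1024 input tokens, 256 summary tokens, 8 beams) could be applied at inference time roughly as follows. This is a sketch under assumptions: the Hub repository ID is not stated here, so `model_id` must be supplied by the caller, and `summarize` and `generation_kwargs` are hypothetical helper names. It assumes transformers, PyTorch, and sentencepiece are installed.

```python
MAX_INPUT_TOKENS = 1024   # input limit stated in this card
MAX_SUMMARY_TOKENS = 256  # summary length limit stated in this card
NUM_BEAMS = 8             # beam count stated in this card


def generation_kwargs() -> dict:
    """Decoding settings to pass to model.generate()."""
    return {"num_beams": NUM_BEAMS, "max_length": MAX_SUMMARY_TOKENS}


def summarize(text: str, model_id: str) -> str:
    """Summarize one bill; input beyond the token limit is truncated."""
    # Imported lazily so the settings above can be inspected without
    # transformers or torch being installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(
        text,
        max_length=MAX_INPUT_TOKENS,
        truncation=True,
        return_tensors="pt",
    )
    summary_ids = model.generate(**inputs, **generation_kwargs())
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)


# Example call; the repository ID below is a placeholder, not a real ID:
# summary = summarize(bill_text, "your-namespace/pegasus-billsum")
```

Truncation silently drops everything past the token limit, so for long bills it may be worth splitting the text into sections and summarizing each one separately.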
Frequently Asked Questions
Q: What makes this model unique?
The model is fine-tuned specifically for legislative text. On the BillSum dataset it reports ROUGE-1, ROUGE-2, and ROUGE-L scores of 56.87, 38.65, and 44.84, which makes it well suited to summarizing bills and similar legal documents.
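For readers unfamiliar with ROUGE, the sketch below shows the idea behind ROUGE-1: unigram overlap between a generated summary and a reference summary, combined into an F1 score. It is a simplified illustration, not the scorer used to produce the numbers in this card. It tokenizes on whitespace and applies no stemming, whereas evaluation packages such as rouge-score handle both more carefully. The scores in this card appear to be on a 0 to 100 scale, so the value returned here would be multiplied by 100 to compare. The function name `rouge1_f1` and the example sentences are made up for illustration.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap, whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count of each shared token,
    # so a repeated word is only credited as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


reference = "the bill amends the tax code"
candidate = "the bill amends tax law"
print(f"ROUGE-1 F1: {100 * rouge1_f1(reference, candidate):.2f}")
```

In this example four of the candidate's five tokens match the reference, giving a precision of 0.8, a recall of 4/6, and an F1 of 8/11.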
Q: What are the recommended use cases?
The model is best suited for summarizing legislative bills, legal documents, and policy papers. It's particularly useful for legal professionals, policy analysts, and researchers who need to quickly digest lengthy legislative texts.