Pegasus-SAMSum
| Property | Value |
|---|---|
| Author | transformersbook |
| Downloads | 79,626 |
| Framework | PyTorch |
| Base Model | google/pegasus-cnn_dailymail |
What is pegasus-samsum?
Pegasus-SAMSum is a fine-tuned version of the Pegasus model optimized for conversation summarization. Developed as part of the "NLP with Transformers" book project, it starts from the google/pegasus-cnn_dailymail checkpoint and is fine-tuned on the SAMSum dataset, a corpus of messenger-style dialogues paired with human-written summaries.
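The quickest way to try the model is through the transformers pipeline API. The snippet below is a minimal sketch: the model id transformersbook/pegasus-samsum is inferred from the author and model name listed above, and the chat text is purely illustrative.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint; the model id combines the author
# and model name listed in the table above.
summarizer = pipeline("summarization", model="transformersbook/pegasus-samsum")

# Illustrative chat input; any multi-turn dialogue string works.
dialogue = (
    "Hannah: Hey, do you have Betty's number?\n"
    "Amanda: Lemme check.\n"
    "Amanda: Sorry, I can't find it.\n"
    "Hannah: Ok, thanks anyway. Bye!"
)

print(summarizer(dialogue)[0]["summary_text"])
```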
Implementation Details
The model was fine-tuned with a learning rate of 5e-05 and a linear schedule with 500 warmup steps, using the Adam optimizer. Because training ran with single-sample batches, gradients were accumulated over 16 steps to reach an effective batch size of 16; a configuration sketch follows the list below.
- Achieves a validation loss of 1.4875
- Trained with Transformers 4.12.0 and PyTorch 1.9.1
- Uses gradient accumulation to recover a larger effective batch size, which also stabilizes gradient estimates
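As a minimal sketch, the reported hyperparameters map onto transformers TrainingArguments as shown below. The output directory, epoch count, and evaluation strategy are illustrative assumptions not stated in the card; the remaining values match those reported above.

```python
from transformers import TrainingArguments

# Sketch of the reported fine-tuning configuration.
training_args = TrainingArguments(
    output_dir="pegasus-samsum",        # illustrative path
    learning_rate=5e-5,                 # as reported
    lr_scheduler_type="linear",         # linear schedule
    warmup_steps=500,                   # as reported
    per_device_train_batch_size=1,      # single-sample batches
    gradient_accumulation_steps=16,     # effective batch size of 16
    num_train_epochs=1,                 # assumption, not stated in the card
    evaluation_strategy="epoch",        # assumption; logs validation loss
)
```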
Core Capabilities
- Specialized in conversation and dialogue summarization
- Optimized for efficient text-to-text generation
- Supports TensorBoard integration for training monitoring
- Compatible with Inference Endpoints for deployment
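To make the text-to-text generation path explicit, the sketch below loads the tokenizer and model directly and decodes a beam-search summary. The generation parameters (max_length, num_beams) and the chat text are illustrative assumptions, not values from the model card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("transformersbook/pegasus-samsum")
model = AutoModelForSeq2SeqLM.from_pretrained("transformersbook/pegasus-samsum")

chat = "Tom: Are we still on for lunch?\nSara: Yes, 12:30 at the usual place."

# Tokenize the dialogue and generate a summary with beam search;
# max_length and num_beams here are illustrative choices.
inputs = tokenizer(chat, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=60, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```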
Frequently Asked Questions
Q: What makes this model unique?
This model pairs the Pegasus architecture, pre-trained for abstractive summarization, with fine-tuning on conversational data, making it particularly effective at summarizing dialogues and conversations, a task where general news-trained summarization models often underperform.
Q: What are the recommended use cases?
The model is best suited for summarizing chat conversations, meeting transcripts, and dialogue-based content where capturing the essence of multi-party interactions is crucial.