Pegasus-SAMSum
| Property | Value |
|---|---|
| Author | transformersbook |
| Downloads | 79,626 |
| Framework | PyTorch |
| Base Model | google/pegasus-cnn_dailymail |
What is pegasus-samsum?
Pegasus-SAMSum is a fine-tuned version of the Pegasus model optimized for conversation summarization. Developed as part of the "NLP with Transformers" book project, it starts from the google/pegasus-cnn_dailymail checkpoint and is fine-tuned on the SAMSum dataset, a corpus of messenger-style dialogues paired with human-written summaries.
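The quickest way to try the model is through the transformers pipeline API. The snippet below is a minimal sketch: the model id transformersbook/pegasus-samsum is inferred from the author and model name listed above, and the chat text is purely illustrative.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint; the model id combines the author
# and model name listed in the table above.
summarizer = pipeline("summarization", model="transformersbook/pegasus-samsum")

# Illustrative chat input; any multi-turn dialogue string works.
dialogue = (
    "Hannah: Hey, do you have Betty's number?\n"
    "Amanda: Lemme check.\n"
    "Amanda: Sorry, I can't find it.\n"
    "Hannah: Ok, thanks anyway. Bye!"
)

print(summarizer(dialogue)[0]["summary_text"])
```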
Implementation Details
The model was fine-tuned with a learning rate of 5e-05 and a linear schedule with 500 warmup steps, using the Adam optimizer. Because training ran with single-sample batches, gradients were accumulated over 16 steps to reach an effective batch size of 16; a configuration sketch follows the list below.
- Achieves a validation loss of 1.4875
- Trained with Transformers 4.12.0 and PyTorch 1.9.1
- Uses gradient accumulation to recover a larger effective batch size, which also stabilizes gradient estimates
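As a minimal sketch, the reported hyperparameters map onto transformers TrainingArguments as shown below. The output directory, epoch count, and evaluation strategy are illustrative assumptions not stated in the card; the remaining values match those reported above.

```python
from transformers import TrainingArguments

# Sketch of the reported fine-tuning configuration.
training_args = TrainingArguments(
    output_dir="pegasus-samsum",        # illustrative path
    learning_rate=5e-5,                 # as reported
    lr_scheduler_type="linear",         # linear schedule
    warmup_steps=500,                   # as reported
    per_device_train_batch_size=1,      # single-sample batches
    gradient_accumulation_steps=16,     # effective batch size of 16
    num_train_epochs=1,                 # assumption, not stated in the card
    evaluation_strategy="epoch",        # assumption; logs validation loss
)
```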
Core Capabilities
- Specialized in conversation and dialogue summarization
- Optimized for efficient text-to-text generation
- Supports TensorBoard integration for training monitoring
- Compatible with Inference Endpoints for deployment
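To make the text-to-text generation path explicit, the sketch below loads the tokenizer and model directly and decodes a beam-search summary. The generation parameters (max_length, num_beams) and the chat text are illustrative assumptions, not values from the model card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("transformersbook/pegasus-samsum")
model = AutoModelForSeq2SeqLM.from_pretrained("transformersbook/pegasus-samsum")

chat = "Tom: Are we still on for lunch?\nSara: Yes, 12:30 at the usual place."

# Tokenize the dialogue and generate a summary with beam search;
# max_length and num_beams here are illustrative choices.
inputs = tokenizer(chat, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=60, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```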
Frequently Asked Questions
Q: What makes this model unique?
This model pairs the Pegasus architecture, pre-trained for abstractive summarization, with fine-tuning on conversational data, making it particularly effective at summarizing dialogues and conversations, a task where general news-trained summarization models often underperform.
Q: What are the recommended use cases?
The model is best suited for summarizing chat conversations, meeting transcripts, and dialogue-based content where capturing the essence of multi-party interactions is crucial.