Pegasus-SAMSum

  • Author: transformersbook
  • Downloads: 79,626
  • Framework: PyTorch
  • Base Model: google/pegasus-cnn_dailymail

What is pegasus-samsum?

Pegasus-SAMSum is a fine-tuned version of the Pegasus model specifically optimized for conversation summarization tasks. Developed as part of the "NLP with Transformers" book project, this model builds upon the google/pegasus-cnn_dailymail architecture and is trained on the SAMSum dataset, which specializes in dialogue summarization.
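
The checkpoint can be loaded with the standard transformers summarization pipeline. A minimal sketch (the sample dialogue below is invented for illustration):

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub
summarizer = pipeline("summarization", model="transformersbook/pegasus-samsum")

# Invented sample dialogue in the SAMSum messenger style
dialogue = """Anna: Are we still on for lunch tomorrow?
Ben: Yes! 12:30 at the usual place?
Anna: Perfect, see you then."""

print(summarizer(dialogue)[0]["summary_text"])
```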

Implementation Details

The model was fine-tuned with a learning rate of 5e-05 on a linear schedule with 500 warmup steps. Training used the Adam optimizer with single-sample batches and 16 gradient accumulation steps, for an effective batch size of 16 (see the sketch after the list below).

  • Achieved validation loss of 1.4875
  • Trained using Transformers 4.12.0 and PyTorch 1.9.1
  • Uses gradient accumulation to reach a larger effective batch size on memory-constrained hardware
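
A minimal sketch of how these hyperparameters map onto the standard transformers TrainingArguments API; the output directory and epoch count are illustrative assumptions, not values reported on this card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="pegasus-samsum",     # illustrative path, not from the card
    learning_rate=5e-5,              # reported learning rate
    lr_scheduler_type="linear",      # linear schedule
    warmup_steps=500,                # reported warmup steps
    per_device_train_batch_size=1,   # single-sample batches
    gradient_accumulation_steps=16,  # effective batch size of 16
    num_train_epochs=1,              # assumed; not stated on the card
    logging_dir="runs",              # enables TensorBoard monitoring
)
```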

Core Capabilities

  • Specialized in conversation and dialogue summarization
  • Optimized for efficient text-to-text generation
  • Supports TensorBoard integration for training monitoring
  • Compatible with Inference Endpoints for deployment

Frequently Asked Questions

Q: What makes this model unique?

This model combines the Pegasus architecture with fine-tuning on conversational data, making it particularly effective at summarizing dialogues, a task that general-purpose summarization models often struggle with.

Q: What are the recommended use cases?

The model is best suited for summarizing chat conversations, meeting transcripts, and dialogue-based content where capturing the essence of multi-party interactions is crucial.
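
For longer inputs such as meeting transcripts, it can help to tune the generation parameters. The values below are illustrative assumptions, not settings reported on this card (reusing the summarizer pipeline from the example above):

```python
# Hypothetical input file holding a meeting transcript
long_transcript = open("meeting_transcript.txt").read()

summary = summarizer(
    long_transcript,
    max_length=128,     # cap the summary length
    num_beams=8,        # beam search for more fluent summaries
    length_penalty=0.8, # mildly favor shorter summaries
)[0]["summary_text"]
print(summary)
```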
