flan-t5-base-samsum

Maintained By
philschmid


Property            Value
License             Apache 2.0
Base Model          google/flan-t5-base
Training Dataset    SAMSum
Primary Task        Text Summarization

What is flan-t5-base-samsum?

FLAN-T5-Base-SAMSum is a text summarization model built on the FLAN-T5-Base architecture and fine-tuned on the SAMSum dataset. With a ROUGE1 score of 47.23, it is particularly effective at summarizing dialogues and conversations.
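
For reference, the snippet below is a minimal sketch of loading the model through the Transformers summarization pipeline. It assumes the model is published on the Hugging Face Hub under the maintainer's namespace as philschmid/flan-t5-base-samsum; the sample dialogue is invented for illustration.

```python
# Minimal usage sketch (assumes the Hub id "philschmid/flan-t5-base-samsum").
from transformers import pipeline

summarizer = pipeline("summarization", model="philschmid/flan-t5-base-samsum")

# Invented example dialogue in the SAMSum chat-log style.
dialogue = """Anna: Are we still on for lunch tomorrow?
Ben: Yes, 12:30 at the usual place?
Anna: Perfect, see you then!"""

print(summarizer(dialogue)[0]["summary_text"])
```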

Implementation Details

The model was trained with the Adam optimizer, a linear learning rate scheduler, and a learning rate of 5e-05. Training ran for 5 epochs with a batch size of 8 for both training and evaluation, and performance metrics improved consistently over the course of training (a code sketch of these settings follows the list below).

  • Training conducted over 9,210 steps
  • Achieved final ROUGE1/2/L scores of 47.81/24.00/40.21
  • Built on PyTorch with Transformers 4.25.1
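
As a rough illustration, the reported hyperparameters map onto the Transformers Seq2SeqTrainingArguments API as sketched below. The output directory, optimizer variant, and evaluation strategy are assumptions; the card does not include the original training script.

```python
# Illustrative mapping of the reported hyperparameters onto
# transformers.Seq2SeqTrainingArguments; not the original training script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-samsum",   # assumed output directory
    learning_rate=5e-5,                 # reported learning rate
    num_train_epochs=5,                 # reported number of epochs
    per_device_train_batch_size=8,      # reported training batch size
    per_device_eval_batch_size=8,       # reported evaluation batch size
    lr_scheduler_type="linear",         # linear schedule, as stated
    optim="adamw_torch",                # Adam-family optimizer (assumed variant)
    predict_with_generate=True,         # generate summaries during evaluation
    evaluation_strategy="epoch",        # assumption: evaluate once per epoch
)
```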

Core Capabilities

  • Specialized in dialogue summarization
  • Optimized for text-to-text generation tasks
  • Average generation length of 17.3 tokens
  • Consistent performance across multiple ROUGE metrics (see the evaluation sketch below)
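
The sketch below shows one way to check ROUGE scores on a slice of the SAMSum test set using the datasets and evaluate libraries. The dataset id "samsum" (which typically also requires the py7zr package), the slice size, and the model id are assumptions for illustration; the card does not specify the exact evaluation setup.

```python
# Evaluation sketch: compute ROUGE on a small slice of the SAMSum test set.
import evaluate
from datasets import load_dataset
from transformers import pipeline

summarizer = pipeline("summarization", model="philschmid/flan-t5-base-samsum")
rouge = evaluate.load("rouge")

# Small slice for a quick check; use the full split for a real evaluation.
test_set = load_dataset("samsum", split="test[:50]")
predictions = [out["summary_text"] for out in summarizer(test_set["dialogue"])]

scores = rouge.compute(predictions=predictions, references=test_set["summary"])
print(scores)  # rouge1 / rouge2 / rougeL f-measures
```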

Frequently Asked Questions

Q: What makes this model unique?

This model combines the powerful FLAN-T5 architecture with specific optimization for dialogue summarization, achieving impressive ROUGE scores while maintaining reasonable output lengths.

Q: What are the recommended use cases?

The model is particularly well-suited for summarizing conversations, chat logs, and dialogue-heavy content, making it ideal for applications in customer service, meeting summarization, and social media content analysis.
