flan-t5-base-samsum

philschmid

A fine-tuned version of FLAN-T5-base for dialogue summarization, reaching a ROUGE-1 score of 47.2 on the SAMSum dataset.

Property          Value
License           Apache 2.0
Base Model        google/flan-t5-base
Training Dataset  SAMSum
Primary Task      Text Summarization

What is flan-t5-base-samsum?

FLAN-T5-Base-SAMSum is a text summarization model obtained by fine-tuning google/flan-t5-base on the SAMSum dataset. It reports a ROUGE-1 score of 47.23 and is aimed at dialogue and conversation summarization.

Implementation Details

The model was trained with the Adam optimizer, a linear learning rate scheduler, and a learning rate of 5e-05. Training ran for 5 epochs with a batch size of 8 for both training and evaluation, and the performance metrics improved consistently over the course of training.

  • Training ran for 9,210 steps in total
  • Final ROUGE-1/2/L scores of 47.81/24.00/40.21
  • Built with PyTorch and Transformers 4.25.1
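
The hyperparameters above can be cross-checked against the reported step count with a small sketch. The SAMSum training split size of 14,732 examples is not stated on this page and is an assumption here, as is training on a single device without gradient accumulation. The `TrainingSetup` class and `total_optimizer_steps` helper are hypothetical names used only for illustration, not part of any library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters as listed on this page (illustrative container only)."""
    optimizer: str = "adam"
    lr_scheduler: str = "linear"
    learning_rate: float = 5e-5
    num_epochs: int = 5
    train_batch_size: int = 8
    eval_batch_size: int = 8


# Assumption: size of the SAMSum training split (not given on this page).
SAMSUM_TRAIN_EXAMPLES = 14_732


def total_optimizer_steps(num_examples: int, setup: TrainingSetup) -> int:
    """Steps per epoch (last partial batch counts as a step) times epochs.

    Assumes one device and no gradient accumulation.
    """
    steps_per_epoch = math.ceil(num_examples / setup.train_batch_size)
    return steps_per_epoch * setup.num_epochs


setup = TrainingSetup()
total_steps = total_optimizer_steps(SAMSUM_TRAIN_EXAMPLES, setup)
print(total_steps)  # 9210, matching the reported step count
```

Under these assumptions each epoch takes 1,842 steps, so 5 epochs give the 9,210 steps listed above.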

Core Capabilities

  • Specialized in dialogue summarization
  • Optimized for text-to-text generation tasks
  • Average generation length of 17.3 tokens
  • Consistent performance across multiple ROUGE metrics
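
To make the ROUGE figures more concrete, here is a minimal sketch of a ROUGE-1 F-measure based on unigram overlap. It is a simplified illustration, assuming whitespace tokenization and lowercasing only. It is not the scoring code used to produce the numbers on this page, and standard ROUGE implementations add steps such as stemming, so their scores will differ. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure: clipped unigram overlap, no stemming."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0

    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


reference = "Amanda baked cookies and will bring Jerry some tomorrow"
candidate = "Amanda will bring Jerry cookies tomorrow"
score = rouge1_f1(reference, candidate)
print(f"ROUGE-1 F1: {score * 100:.2f}")
```

The score is multiplied by 100 in the print statement because the figures on this page appear to be on a 0 to 100 scale.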

Frequently Asked Questions

Q: What makes this model unique?

This model pairs the FLAN-T5-base architecture with fine-tuning aimed specifically at dialogue summarization. It reports ROUGE-1 scores above 47 on SAMSum while keeping summaries short, at about 17 tokens on average.

Q: What are the recommended use cases?

The model is well suited to summarizing conversations, chat logs, and other dialogue-heavy content, which makes it a natural fit for customer service, meeting summarization, and social media content analysis.
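
A chat log could be summarized along the following lines. This is a sketch under stated assumptions: the Hub identifier `philschmid/flan-t5-base-samsum` is inferred from the author and model name on this page, the "Speaker: utterance" line format is assumed to match what the model saw during fine-tuning, and the `format_dialogue` and `summarize` helpers are hypothetical. It uses the `pipeline` function from the Transformers library, which is imported lazily so that the formatting helper works without it installed.

```python
from typing import List, Tuple

# Assumption: Hub identifier inferred from the author and model name above.
MODEL_ID = "philschmid/flan-t5-base-samsum"


def format_dialogue(turns: List[Tuple[str, str]]) -> str:
    """Join (speaker, utterance) pairs into one 'Speaker: utterance' line per turn."""
    return "\n".join(f"{speaker}: {utterance}" for speaker, utterance in turns)


def summarize(dialogue: str, max_new_tokens: int = 60) -> str:
    """Summarize a formatted dialogue; downloads the model on first use."""
    # Imported here so the rest of the module does not require transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    result = summarizer(dialogue, max_new_tokens=max_new_tokens)
    return result[0]["summary_text"]


chat = format_dialogue([
    ("Amanda", "I baked cookies. Do you want some?"),
    ("Jerry", "Sure!"),
    ("Amanda", "I'll bring you some tomorrow."),
])
print(chat)

# To produce a summary (requires transformers, PyTorch, and network access):
# print(summarize(chat))
```

Building the pipeline inside `summarize` keeps the sketch short. An application that summarizes many conversations would more likely create the pipeline once and reuse it.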
