bart-large-cnn-samsum

Maintained By
philschmid

Property             Value
License              MIT
Training Framework   PyTorch
Base Model           facebook/bart-large-cnn
Task                 Conversation Summarization

What is bart-large-cnn-samsum?

bart-large-cnn-samsum is a dialogue summarization model built on the BART architecture and fine-tuned for conversation summarization on the SAMSum dataset. The model was trained using Amazon SageMaker and achieves strong ROUGE scores, including a ROUGE-1 score of 41.32 on the test set.
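
For quick experimentation, the model can be loaded through the Hugging Face transformers pipeline API. The snippet below is a minimal sketch; the sample dialogue is invented for illustration.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint; "summarization" is the task this model serves.
summarizer = pipeline("summarization", model="philschmid/bart-large-cnn-samsum")

# Example dialogue (invented for illustration) in SAMSum's chat-style format.
dialogue = """Hannah: Hey, do you have Betty's number?
Amanda: Lemme check.
Amanda: Sorry, can't find it.
Hannah: Ok, thanks anyway."""

result = summarizer(dialogue)
print(result[0]["summary_text"])
```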

Implementation Details

The model was trained with a learning rate of 5e-05, 3 training epochs, and FP16 mixed precision. It builds on the BART-large-CNN checkpoint and is tuned to generate concise, accurate summaries of conversational text. The full reported configuration is listed below, followed by a training-setup sketch.

  • Trained with batch size of 4 for both training and evaluation
  • Implements predict_with_generate functionality
  • Uses seed 7 for reproducibility
  • Supports mixed precision training with FP16
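
As a minimal sketch, the reported hyperparameters map onto the transformers Seq2SeqTrainingArguments API as shown below; the output path is a hypothetical placeholder, and the surrounding trainer wiring (datasets, tokenizer) is not part of the original card.

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters reported above; output_dir is a hypothetical path.
training_args = Seq2SeqTrainingArguments(
    output_dir="./bart-large-cnn-samsum",  # hypothetical output location
    learning_rate=5e-05,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    fp16=True,                   # mixed precision training
    predict_with_generate=True,  # generate summaries during evaluation
    seed=7,                      # fixed seed for reproducibility
)
```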

Core Capabilities

  • Achieves 41.32 ROUGE-1, 20.87 ROUGE-2, and 32.13 ROUGE-L on the test set
  • Specialized in summarizing dialogue and conversation texts
  • Supports batch processing and efficient inference (a batching sketch follows this list)
  • Optimized for deployment on Amazon SageMaker
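
A minimal sketch of batched inference with the pipeline API, assuming inputs are plain dialogue strings; the batch_size value and sample dialogues are illustrative choices, not values from the card.

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="philschmid/bart-large-cnn-samsum")

# Invented dialogues for illustration; real inputs follow the same chat format.
dialogues = [
    "Tom: Lunch at noon?\nSara: Sure, the usual place?\nTom: Yes, see you there.",
    "Ana: Did you push the fix?\nLee: Just merged it.\nAna: Great, deploying now.",
]

# Passing a list lets the pipeline batch inputs; batch_size is an assumed value.
summaries = summarizer(dialogues, batch_size=2)
for s in summaries:
    print(s["summary_text"])
```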

Frequently Asked Questions

Q: What makes this model unique?

This model specializes in conversation summarization, differentiating it from general text summarization models. It's specifically optimized for dialogue contexts and achieves strong performance metrics on the SAMSum dataset.

Q: What are the recommended use cases?

The model is ideal for summarizing chat conversations, meeting transcripts, customer service interactions, and any form of dialogue-based content. It's particularly well-suited for integration with Amazon SageMaker deployments.
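
For SageMaker deployments, a minimal sketch using the sagemaker Python SDK is shown below; the IAM role, instance type, and container versions are assumptions and must be adapted to your environment.

```python
from sagemaker.huggingface import HuggingFaceModel

# Role, instance type, and framework versions are assumed values; adjust to your account.
huggingface_model = HuggingFaceModel(
    env={
        "HF_MODEL_ID": "philschmid/bart-large-cnn-samsum",  # pull the model from the Hub
        "HF_TASK": "summarization",
    },
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # hypothetical IAM role
    transformers_version="4.26",  # assumed container versions
    pytorch_version="1.13",
    py_version="py39",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",  # assumed instance type
)

print(predictor.predict({"inputs": "Tom: Lunch at noon?\nSara: Sure!"}))
```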
