BART Summarisation Model
| Property | Value |
|---|---|
| Author | slauw87 |
| License | Apache 2.0 |
| Training Dataset | SAMSum |
| Base Model | facebook/bart-large-cnn |
What is bart_summarisation?
This is a text summarization model built on the BART architecture and fine-tuned for conversational summarization on the SAMSum dialogue dataset. The model was trained using Amazon SageMaker and reaches a ROUGE-1 score of 43.21 on the validation set.
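For orientation, here is a minimal inference sketch using the Hugging Face transformers pipeline. It assumes the Hub ID is slauw87/bart_summarisation (combining the author and model name above); the sample dialogue and the generation length limits are illustrative, not values from this card.

```python
# Minimal inference sketch with the `transformers` summarization pipeline.
# The model ID is inferred from this card's author and model name; the
# dialogue and length limits below are illustrative assumptions.
from transformers import pipeline

summarizer = pipeline("summarization", model="slauw87/bart_summarisation")

dialogue = """
Anna: Are we still on for lunch tomorrow?
Ben: Yes! 12:30 at the usual place?
Anna: Perfect, see you there.
"""

# max_length / min_length bound the generated summary in tokens.
summary = summarizer(dialogue, max_length=60, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```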
Implementation Details
The model uses the BART-large-CNN architecture as its foundation and was fine-tuned with FP16 mixed precision, a learning rate of 5e-05 over 3 training epochs, and a batch size of 4, with predict-with-generate enabled so that summaries are generated during evaluation. A sketch of these training arguments follows the list below.
- Trained using Amazon SageMaker with Hugging Face Deep Learning container
- Implements FP16 training for improved efficiency
- Optimized hyperparameters for dialogue summarization
- Achieves 43.21 ROUGE-1, 22.35 ROUGE-2, and 33.32 ROUGE-L scores on validation
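The hyperparameters above map onto transformers' Seq2SeqTrainingArguments roughly as follows. This is a reconstruction from the values listed on the card, not the author's actual SageMaker training script; the output directory is a placeholder.

```python
# Hedged reconstruction of the fine-tuning configuration from the
# hyperparameters listed above. The output path is a placeholder; dataset
# loading and tokenization are omitted since they are not described here.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./bart_samsum",        # placeholder path
    learning_rate=5e-5,                # from the card
    num_train_epochs=3,                # from the card
    per_device_train_batch_size=4,     # from the card
    fp16=True,                         # FP16 mixed precision, from the card
    predict_with_generate=True,        # generate summaries during evaluation
)
```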
Core Capabilities
- Conversation and dialogue summarization
- Abstractive text summarization
- Efficient processing with batched inference (see the sketch after this list)
- Production-ready with SageMaker deployment support
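On batched inference: the summarization pipeline accepts a list of inputs and a batch_size argument, as sketched below. The dialogues and the batch size are illustrative assumptions.

```python
# Batched inference sketch: the pipeline accepts a list of inputs and
# batches them internally. The inputs and batch size are illustrative.
from transformers import pipeline

summarizer = pipeline("summarization", model="slauw87/bart_summarisation")

dialogues = [
    "Tom: Did you send the report? Jane: Yes, this morning.",
    "Sam: Movie tonight? Alex: Can't, working late. Sam: Tomorrow then?",
]

# `batch_size` controls how many inputs go through each forward pass.
summaries = summarizer(dialogues, batch_size=2, max_length=60, min_length=5)
for s in summaries:
    print(s["summary_text"])
```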
Frequently Asked Questions
Q: What makes this model unique?
This model specializes in dialogue summarization, specifically trained on the SAMSum dataset, making it particularly effective for summarizing conversations and chat logs. Its integration with Amazon SageMaker makes it suitable for production deployments.
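For SageMaker specifically, a hedged deployment sketch using the sagemaker Python SDK's Hugging Face inference support is shown below. The IAM role, instance type, and framework versions are assumptions to adjust for your account; check the SageMaker documentation for currently supported version combinations.

```python
# Hedged SageMaker deployment sketch using the `sagemaker` SDK's Hugging Face
# inference support. Role, instance type, and framework versions are
# assumptions, not values from this card.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes a SageMaker execution context

hub_env = {
    "HF_MODEL_ID": "slauw87/bart_summarisation",  # pull the model from the Hub
    "HF_TASK": "summarization",
}

model = HuggingFaceModel(
    env=hub_env,
    role=role,
    transformers_version="4.26",  # example version, adjust as needed
    pytorch_version="1.13",       # example version
    py_version="py39",            # example version
)

predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
print(predictor.predict({"inputs": "Amy: Coffee at 10? Joe: Sure, see you then."}))
```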
Q: What are the recommended use cases?
The model is ideal for applications requiring conversation summarization, such as chat analysis, meeting summary generation, and customer service interaction summaries. Because it was tuned on dialogues, it is a better fit for dialogue-based content than for general text summarization.