# DistilBART CNN 6-6
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Parameters | 230M |
| Speedup vs Baseline | 2.09x |
| ROUGE-2 Score | 20.17 |
| ROUGE-L Score | 29.70 |
## What is distilbart-cnn-6-6?
DistilBART CNN 6-6 is a compressed version of the BART model specifically optimized for text summarization tasks. It was trained on the CNN/DailyMail dataset and represents a careful balance between performance and efficiency, achieving significant speed improvements while maintaining strong summarization capabilities.
## Implementation Details
The model uses knowledge distillation to compress the original BART architecture while preserving its summarization ability. With 230M parameters, it achieves a 2.09x speedup over the baseline BART-large-cnn model, with an inference time of 182ms.
- Optimized architecture with 6-6 configuration (6 encoder and 6 decoder layers)
- Trained on CNN/DailyMail dataset for news summarization
- Implements BartForConditionalGeneration architecture
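As a sketch of how the model might be loaded for inference, assuming the Hugging Face `transformers` library and the `sshleifer/distilbart-cnn-6-6` Hub checkpoint (the `truncate_words` helper is illustrative, not part of any library):

```python
def truncate_words(text, max_words=900):
    """Rough pre-truncation so very long articles stay near BART's
    1024-token input limit (word count is only a proxy for tokens)."""
    words = text.split()
    return " ".join(words[:max_words])

def summarize(text, max_length=142, min_length=56):
    """Summarize a news article with DistilBART CNN 6-6."""
    # Lazy import so the helper above is usable without transformers installed.
    from transformers import pipeline
    summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-6-6")
    out = summarizer(truncate_words(text), max_length=max_length,
                     min_length=min_length, truncation=True)
    return out[0]["summary_text"]
```

The generation bounds (`max_length`, `min_length`) shown here are example values, not the model's defaults; tune them to the length of summary you need.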
## Core Capabilities
- Fast and efficient text summarization
- Strong performance on news article summarization
- Maintains competitive ROUGE scores (ROUGE-2: 20.17, ROUGE-L: 29.70)
- Suitable for production deployment with reduced computational requirements
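To make the ROUGE-2 figure concrete: ROUGE-2 measures bigram overlap between a candidate summary and a reference summary. A minimal pure-Python sketch of the F1 variant follows; real evaluations use proper tokenization and stemming (e.g. the `rouge-score` package), so treat this as an illustration of the metric, not a drop-in scorer:

```python
from collections import Counter

def rouge2_f1(candidate, reference):
    """ROUGE-2 F1: harmonic mean of bigram precision and recall."""
    def bigrams(text):
        tokens = text.lower().split()
        return Counter(zip(tokens, tokens[1:]))

    cand, ref = bigrams(candidate), bigrams(reference)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())  # clipped bigram matches
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

A reported ROUGE-2 of 20.17 corresponds to an average F1 of about 0.20 on this scale across the test set.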
## Frequently Asked Questions
### Q: What makes this model unique?
This model stands out for its balance between speed and quality: a 2.09x speedup over the baseline BART-large-cnn model while retaining strong summarization performance. Its 6-6 configuration (6 encoder and 6 decoder layers, versus 12-12 in the teacher) is an efficient compromise between model size and accuracy.
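The 6-6 student can be thought of as keeping a subset of the teacher's 12 encoder and 12 decoder layers before fine-tuning. A minimal sketch of one plausible even-spacing selection scheme follows; the exact layers copied for the released checkpoint may differ, so this only illustrates the idea:

```python
def pick_student_layers(teacher_layers=12, student_layers=6):
    """Choose evenly spaced teacher layer indices to initialize a
    smaller student stack (one illustrative scheme, not the official mapping)."""
    step = teacher_layers / student_layers
    return [int(i * step) for i in range(student_layers)]
```

With the 12-layer teacher and a 6-layer student, this selects every other layer; the chosen encoder and decoder layers would then be copied into the student before distillation fine-tuning on CNN/DailyMail.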
### Q: What are the recommended use cases?
The model is particularly well suited to news article summarization, content condensation, and latency-sensitive applications that need efficient text summarization. It is a good fit for production environments where processing speed matters but quality cannot be significantly compromised.