DistilBART-XSUM-12-6

Maintained by sshleifer

Property              Value
-------------------   -----------------
Author                sshleifer
Parameter Count       306M
Model Type            Summarization
Hugging Face          Model Repository
Inference Time        137 ms
Speedup vs Baseline   1.68x

What is distilbart-xsum-12-6?

DistilBART-XSUM-12-6 is a knowledge-distilled version of the BART model optimized for extreme summarization. It scores 22.12 ROUGE-2 and 36.99 ROUGE-L on XSUM, surpassing the larger bart-large-xsum baseline while using fewer parameters and offering faster inference.

Implementation Details

The model architecture features 12 encoder layers and 6 decoder layers, hence the "12-6" naming convention. It is loaded with BartForConditionalGeneration.from_pretrained, as sketched below, and offers a strong balance between model size and performance.
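A minimal loading-and-generation sketch; the generation arguments (num_beams, max_length) are illustrative, not values prescribed by the model card:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Checkpoint name on the Hugging Face Hub
model_id = "sshleifer/distilbart-xsum-12-6"
tokenizer = BartTokenizer.from_pretrained(model_id)
model = BartForConditionalGeneration.from_pretrained(model_id)

article = "Replace this with the document you want to summarize."
inputs = tokenizer(article, truncation=True, max_length=1024, return_tensors="pt")

# num_beams/max_length are illustrative; the checkpoint ships its own generation defaults
summary_ids = model.generate(**inputs, num_beams=4, max_length=62)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```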

  • 306M parameters (a 25% reduction from the 406M baseline)
  • 1.68x inference speedup compared to bart-large-xsum
  • Optimized for the XSUM dataset
  • 137 ms inference time per sample (see the timing sketch after this list)
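The latency and speedup figures depend on hardware, batch size, and generation settings. A hypothetical micro-benchmark along these lines can measure single-sample latency on your own machine; expect numbers that differ from the published 137 ms:

```python
import time

import torch
from transformers import BartForConditionalGeneration, BartTokenizer

model_id = "sshleifer/distilbart-xsum-12-6"
tokenizer = BartTokenizer.from_pretrained(model_id)
model = BartForConditionalGeneration.from_pretrained(model_id).eval()

inputs = tokenizer("Some long article text to summarize.", return_tensors="pt")
with torch.no_grad():
    start = time.perf_counter()
    model.generate(**inputs, num_beams=4)
    elapsed_ms = (time.perf_counter() - start) * 1000
print(f"single-sample latency: {elapsed_ms:.0f} ms")
```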

Core Capabilities

  • Extreme summarization with state-of-the-art performance
  • Efficient inference with reduced computational requirements
  • Superior ROUGE scores compared to the baseline model
  • Maintains quality while reducing model size

Frequently Asked Questions

Q: What makes this model unique?

This model outperforms its larger parent while being significantly more efficient. With 306M parameters versus the baseline's 406M, it delivers higher ROUGE scores and 1.68x faster inference.

Q: What are the recommended use cases?

The model is designed for extreme summarization, particularly single-sentence summaries in the style of the XSUM dataset. It is a good fit for applications that need concise, high-quality summaries under tight compute budgets.
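For quick experimentation, the transformers summarization pipeline wraps the model in a single call; the generation arguments here are illustrative:

```python
from transformers import pipeline

# One-line usage via the summarization pipeline
summarizer = pipeline("summarization", model="sshleifer/distilbart-xsum-12-6")
article = (
    "The tower is 324 metres tall, about the same height as an 81-storey "
    "building, and was the tallest man-made structure in the world for 41 years."
)
result = summarizer(article, max_length=60, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```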
