distilbart-cnn-12-6-finetuned-weaksup-1000
| Property | Value |
|---|---|
| Base Model | sshleifer/distilbart-cnn-12-6 |
| Training Type | Fine-tuned with weak supervision |
| Framework | PyTorch 1.10.2, Transformers 4.16.2 |
| Model URL | HuggingFace |
What is distilbart-cnn-12-6-finetuned-weaksup-1000?
This model is a fine-tuned version of sshleifer/distilbart-cnn-12-6, optimized through weak supervision on 1,000 training samples. It is a carefully tuned variant of the base model and achieves notable ROUGE scores on text summarization tasks.
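A minimal inference sketch is shown below, assuming the checkpoint is loaded through the Transformers summarization pipeline; the `MODEL_ID` repo path and the article text are placeholders, not values from the model card.

```python
from transformers import pipeline

# Placeholder repo ID -- substitute the actual Hugging Face path of this checkpoint.
MODEL_ID = "your-org/distilbart-cnn-12-6-finetuned-weaksup-1000"

summarizer = pipeline("summarization", model=MODEL_ID)

# Toy input; any news-style paragraph works.
article = (
    "The city council approved a new transit plan on Tuesday, allocating funds "
    "for additional bus routes and a light-rail extension expected to open in 2026."
)

# Generation lengths are illustrative; the model's average generation length is ~66 tokens.
summary = summarizer(article, max_length=80, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```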
Implementation Details
The model was trained with the Adam optimizer using carefully selected hyperparameters (learning rate: 2e-05, betas=(0.9, 0.999)) and native automatic mixed precision (AMP). Training ran for a single epoch with a batch size of 1 for both training and evaluation; the configuration is sketched in code after the list below.
- Training utilized a linear learning rate scheduler
- Achieved training loss of 1.644 and validation loss of 1.6818
- Average generation length: 66.44 tokens
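The following sketch translates the reported settings into `Seq2SeqTrainingArguments`; `output_dir` and `predict_with_generate` are assumptions not stated in the model card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported training setup (Adam, lr 2e-05, linear schedule,
# batch size 1, one epoch, native AMP). Values not listed above are assumed.
training_args = Seq2SeqTrainingArguments(
    output_dir="distilbart-cnn-12-6-finetuned-weaksup-1000",  # assumed name
    learning_rate=2e-5,                  # Adam with betas=(0.9, 0.999)
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    fp16=True,                           # native AMP mixed precision
    predict_with_generate=True,          # assumption: needed for ROUGE evaluation
)
```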
Core Capabilities
- ROUGE-1 Score: 25.9199
- ROUGE-2 Score: 11.2697
- ROUGE-L Score: 20.3598
- ROUGE-Lsum Score: 22.8242
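Scores like these are typically computed with the `evaluate` library's ROUGE metric; the sketch below uses toy predictions and references, not the model's evaluation data.

```python
import evaluate

# Load the ROUGE metric and score a toy prediction against a toy reference.
rouge = evaluate.load("rouge")

predictions = ["the council approved a new transit plan with more bus routes"]
references = ["city council approves transit plan adding bus routes and light rail"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# Scale to percentages to match the reporting convention used above.
print({k: round(v * 100, 4) for k, v in scores.items()})
```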
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its fine-tuning approach: weak supervision on a small dataset of 1,000 samples, which adapts it to targeted summarization tasks while maintaining reasonable ROUGE scores.
Q: What are the recommended use cases?
While specific use cases aren't detailed in the model card, its architecture and metrics make it suitable for text summarization, particularly of news articles or similar content, since the base model was distilled from BART fine-tuned on the CNN/DailyMail news dataset.