distilbart-cnn-12-6-finetuned-weaksup-1000
| Property | Value |
|---|---|
| Base Model | sshleifer/distilbart-cnn-12-6 |
| Training Type | Fine-tuned with weak supervision |
| Framework | PyTorch 1.10.2, Transformers 4.16.2 |
| Model URL | HuggingFace |
What is distilbart-cnn-12-6-finetuned-weaksup-1000?
This model is a fine-tuned version of sshleifer/distilbart-cnn-12-6, optimized through weak supervision on 1,000 training samples. It is a carefully tuned variant of the base model and achieves notable ROUGE scores on text summarization tasks.
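A minimal inference sketch is shown below, assuming the checkpoint is loaded through the Transformers summarization pipeline; the `MODEL_ID` repo path and the article text are placeholders, not values from the model card.

```python
from transformers import pipeline

# Placeholder repo ID -- substitute the actual Hugging Face path of this checkpoint.
MODEL_ID = "your-org/distilbart-cnn-12-6-finetuned-weaksup-1000"

summarizer = pipeline("summarization", model=MODEL_ID)

# Toy input; any news-style paragraph works.
article = (
    "The city council approved a new transit plan on Tuesday, allocating funds "
    "for additional bus routes and a light-rail extension expected to open in 2026."
)

# Generation lengths are illustrative; the model's average generation length is ~66 tokens.
summary = summarizer(article, max_length=80, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```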
Implementation Details
The model was trained with the Adam optimizer using carefully selected hyperparameters (learning rate: 2e-05, betas=(0.9, 0.999)) and native automatic mixed precision (AMP). Training ran for a single epoch with a batch size of 1 for both training and evaluation; the configuration is sketched in code after the list below.
- Training utilized a linear learning rate scheduler
- Achieved training loss of 1.644 and validation loss of 1.6818
- Average generation length: 66.44 tokens
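The following sketch translates the reported settings into `Seq2SeqTrainingArguments`; `output_dir` and `predict_with_generate` are assumptions not stated in the model card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported training setup (Adam, lr 2e-05, linear schedule,
# batch size 1, one epoch, native AMP). Values not listed above are assumed.
training_args = Seq2SeqTrainingArguments(
    output_dir="distilbart-cnn-12-6-finetuned-weaksup-1000",  # assumed name
    learning_rate=2e-5,                  # Adam with betas=(0.9, 0.999)
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    fp16=True,                           # native AMP mixed precision
    predict_with_generate=True,          # assumption: needed for ROUGE evaluation
)
```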
Core Capabilities
- ROUGE-1 Score: 25.9199
- ROUGE-2 Score: 11.2697
- ROUGE-L Score: 20.3598
- ROUGE-Lsum Score: 22.8242
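Scores like these are typically computed with the `evaluate` library's ROUGE metric; the sketch below uses toy predictions and references, not the model's evaluation data.

```python
import evaluate

# Load the ROUGE metric and score a toy prediction against a toy reference.
rouge = evaluate.load("rouge")

predictions = ["the council approved a new transit plan with more bus routes"]
references = ["city council approves transit plan adding bus routes and light rail"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# Scale to percentages to match the reporting convention used above.
print({k: round(v * 100, 4) for k, v in scores.items()})
```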
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its fine-tuning approach: weak supervision on a small dataset of 1,000 samples, which adapts it to targeted summarization tasks while maintaining reasonable ROUGE scores.
Q: What are the recommended use cases?
While specific use cases aren't detailed in the model card, its architecture and metrics make it suitable for text summarization, particularly of news articles or similar content, since the base model was distilled from BART fine-tuned on the CNN/DailyMail news dataset.