distilbart-cnn-12-6-finetuned-weaksup-1000

Maintained By
cammy

Base Model: sshleifer/distilbart-cnn-12-6
Training Type: Fine-tuned with weak supervision
Framework: PyTorch 1.10.2, Transformers 4.16.2
Model URL: HuggingFace

What is distilbart-cnn-12-6-finetuned-weaksup-1000?

This model is a fine-tuned version of sshleifer/distilbart-cnn-12-6 (a distilled BART summarizer trained on CNN/DailyMail), adapted through weak supervision on 1000 training samples. It targets abstractive text summarization; its evaluation ROUGE scores are listed under Core Capabilities below.

Implementation Details

The model was trained with the Adam optimizer (learning rate 2e-05, betas=(0.9, 0.999)) using native AMP for mixed-precision training. Training ran for a single epoch with a batch size of 1 for both training and evaluation; a sketch of a comparable setup follows the list below.

  • Training utilized a linear learning rate scheduler
  • Achieved training loss of 1.644 and validation loss of 1.6818
  • Average generation length: 66.44 tokens
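
The full training script is not part of the model card. The following is a minimal sketch of how an equivalent fine-tuning run could be set up with the Transformers Seq2SeqTrainer, using the hyperparameters listed above; the tiny dataset shown is a placeholder for the unpublished weakly supervised examples.

```python
# Hypothetical reconstruction of the fine-tuning setup described above; the original
# training script and the 1000 weakly supervised examples are not published, so the
# tiny dataset below is purely a placeholder.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_model = "sshleifer/distilbart-cnn-12-6"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model)

# Placeholder stand-in for the weakly supervised document/summary pairs.
raw = Dataset.from_dict({
    "document": ["A long news article would go here ..."],
    "summary": ["A short reference summary would go here ..."],
})

def preprocess(batch):
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(batch["summary"], max_length=142, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

# Hyperparameters taken from the model card: Adam, lr 2e-05, linear schedule,
# one epoch, batch size 1, native AMP (fp16).
training_args = Seq2SeqTrainingArguments(
    output_dir="distilbart-cnn-12-6-finetuned-weaksup-1000",
    learning_rate=2e-5,
    num_train_epochs=1,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    lr_scheduler_type="linear",
    fp16=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```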

Core Capabilities

  • ROUGE-1 Score: 25.9199
  • ROUGE-2 Score: 11.2697
  • ROUGE-L Score: 20.3598
  • ROUGE-Lsum Score: 22.8242
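
These scores come from the model's own evaluation set. To compute comparable ROUGE numbers on your own summaries, a minimal sketch with the `evaluate` library looks like the following; the predictions and references are placeholders.

```python
# Sketch of how ROUGE scores like the ones above can be computed; requires the
# `evaluate` and `rouge_score` packages. Predictions and references are placeholders.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["the model's generated summary for an article ..."]
references = ["the human-written reference summary for that article ..."]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# Recent versions of `evaluate` return plain F1 floats, e.g.
# {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
print(scores)
```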

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its fine-tuning approach: weak supervision on a specific set of 1000 samples, which adapts the base summarizer to a targeted summarization task while maintaining reasonable ROUGE scores.

Q: What are the recommended use cases?

Specific use cases aren't detailed in the model card, but its architecture and metrics make it suitable for abstractive text summarization, particularly of news articles or similar content, since the base model was trained on the CNN/DailyMail news summarization dataset. A minimal inference sketch follows.
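
The model can be loaded with the standard Transformers summarization pipeline. The Hub ID below is inferred from the maintainer and model name shown above and should be verified against the actual repository listing.

```python
# Minimal inference sketch; the Hub ID below is inferred from the maintainer and model
# name shown above and may differ from the actual repository path.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="cammy/distilbart-cnn-12-6-finetuned-weaksup-1000",
)

article = "Long news article text goes here ..."
result = summarizer(article, max_length=128, min_length=30, truncation=True)
print(result[0]["summary_text"])
```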
