# bart-large-finetuned-filtered-spotify-podcast-summ
| Property | Value |
|---|---|
| Base Model | facebook/bart-large-cnn |
| License | MIT |
| Paper | Research Paper |
| Training Dataset Size | 69,336 episodes |
## What is bart-large-finetuned-filtered-spotify-podcast-summ?
This is a specialized summarization model fine-tuned on the Spotify Podcast Dataset, built upon the BART-large-CNN architecture. It's designed to generate concise, readable summaries of podcast transcripts that help users decide whether to listen to a particular episode. The model achieved a training loss of 2.2967 and validation loss of 2.8316 after 2 epochs of training.
## Implementation Details
The model follows a two-stage approach: an extractive module first selects the most important transcript segments, which are then condensed by the abstractive BART summarizer. Training used the AdamWeightDecay optimizer with a learning rate of 2e-05 in float32 precision.
- Training set: 69,336 episodes
- Validation set: 7,705 episodes
- Test set: 1,025 episodes
- Supports variable-length summaries (39-250 tokens)
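The extractive stage described above can be sketched as follows. This is a minimal illustration using a simple word-frequency scoring heuristic; the heuristic, function name, and scoring details are assumptions for illustration, not the selection model actually used in training.

```python
# Minimal sketch of an extractive pre-selection stage (assumed heuristic:
# score each sentence by the average corpus frequency of its content words,
# then keep the top-k sentences in their original order).
import re
from collections import Counter


def select_segments(transcript: str, k: int = 3) -> str:
    """Return the k highest-scoring sentences, preserving document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", transcript) if s.strip()]
    # Crude content-word filter: lowercase tokens longer than 3 characters.
    words = re.findall(r"[a-z']+", transcript.lower())
    freq = Counter(w for w in words if len(w) > 3)

    def score(sentence: str) -> float:
        toks = [w for w in re.findall(r"[a-z']+", sentence.lower()) if len(w) > 3]
        return sum(freq[w] for w in toks) / len(toks) if toks else 0.0

    top = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)[:k]
    return " ".join(sentences[i] for i in sorted(top))
```

The selected segments, rather than the full transcript, are then fed to the abstractive summarizer, which keeps the input within BART's context window.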
## Core Capabilities
- Automatic podcast transcript summarization
- Human-readable summary generation
- Mobile-friendly output length
- Content-faithful summarization
## Frequently Asked Questions
**Q: What makes this model unique?**
This model is specifically optimized for podcast content summarization, combining extractive and abstractive techniques to create concise, informative summaries. It's trained on a carefully filtered dataset to ensure high-quality outputs suitable for quick consumption on mobile devices.
**Q: What are the recommended use cases?**
The model is ideal for automated podcast content previews, content management systems, and podcast platforms looking to provide quick episode overviews. It's particularly suited for scenarios where users need to quickly decide whether to invest time in listening to a full podcast episode.
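A hypothetical inference sketch using the Hugging Face `transformers` summarization pipeline. The helper names are illustrative, the hub namespace in the model id is left as a placeholder (check the model page for the exact id), and the length bounds come from the 39-250 token range stated above.

```python
# Hedged usage sketch: summarize a podcast transcript with the fine-tuned model.
def generation_kwargs(min_tokens: int = 39, max_tokens: int = 250) -> dict:
    """Length bounds matching the card's supported summary range."""
    return {"min_length": min_tokens, "max_length": max_tokens, "truncation": True}


def summarize(transcript: str, model_id: str) -> str:
    # Deferred import so the helpers above work without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    return summarizer(transcript, **generation_kwargs())[0]["summary_text"]


# Example (downloads the model weights on first use):
# print(summarize(open("episode_transcript.txt").read(),
#                 "<namespace>/bart-large-finetuned-filtered-spotify-podcast-summ"))
```

For long episodes, the transcript should first be reduced by the extractive stage so it fits within the summarizer's input limit.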