bart-large-cnn

facebook

BART-large-CNN: a 406M-parameter transformer-based summarization model fine-tuned on the CNN/Daily Mail dataset. It achieves ROUGE-1 42.95 and ROUGE-2 20.81 on that benchmark.

| Property | Value |
|---|---|
| Parameter Count | 406M |
| License | MIT |
| Author | Facebook |
| Paper | View Paper |
| ROUGE-1 Score | 42.95 |

What is bart-large-cnn?

bart-large-cnn is a transformer-based model designed for text summarization. It is built on Facebook's BART architecture, which combines a bidirectional encoder (similar to BERT) with an autoregressive decoder (similar to GPT). This checkpoint has been fine-tuned on the CNN/Daily Mail dataset, making it particularly effective for news summarization.

Implementation Details

The model uses a sequence-to-sequence architecture with 406M parameters. BART is pre-trained with a denoising objective: input text is corrupted with noise and the model learns to reconstruct the original content. This checkpoint was then fine-tuned specifically for summarization and achieves strong ROUGE scores on the CNN/Daily Mail benchmark.

  • Bidirectional encoder combined with autoregressive decoder
  • Fine-tuned on the CNN/Daily Mail dataset
  • Supports variable length output summaries
  • Achieves ROUGE-1: 42.95, ROUGE-2: 20.81, ROUGE-L: 30.62
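The architecture described above can be used directly through the Hugging Face `transformers` pipeline. A minimal sketch, loading the public `facebook/bart-large-cnn` checkpoint; the example article is invented for illustration:

```python
from transformers import pipeline

# Load the fine-tuned summarization checkpoint from the Hugging Face Hub
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Illustrative input text (not from the model card)
article = (
    "The Eiffel Tower, completed in 1889 as the entrance arch to the World's Fair, "
    "stands 330 metres tall and was the tallest man-made structure in the world for "
    "41 years. Initially criticised by leading artists, it has since become a global "
    "icon of France and one of the most visited monuments on Earth."
)

# do_sample=False gives deterministic (greedy/beam) decoding
result = summarizer(article, max_length=60, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```

The pipeline returns a list of dicts, one per input, each with a `summary_text` key.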

Core Capabilities

  • Text summarization with high coherence and accuracy
  • Flexible summary length control through max_length and min_length parameters
  • Effective handling of long-form articles and news content
  • Abstractive summarization that can generate novel phrasing rather than only copying source sentences verbatim

Frequently Asked Questions

Q: What makes this model unique?

This model stands out due to its specialized fine-tuning on the CNN/Daily Mail dataset and its strong ROUGE scores. The combination of a BERT-like encoder and a GPT-like decoder makes it particularly effective at generating high-quality summaries while maintaining context and coherence.

Q: What are the recommended use cases?

The model excels at summarizing news articles, long-form content, and general text summarization tasks. It's particularly well-suited for applications requiring concise, accurate summaries of longer texts while maintaining key information and context.
