led-large-book-summary

Maintained by: pszemraj

Property            Value
Parameter Count     460M
License             BSD-3-Clause
Paper               BookSum Paper
Max Input Length    16,384 tokens

What is led-large-book-summary?

led-large-book-summary is a text summarization model built on the LED (Longformer Encoder-Decoder) architecture, which is designed for long documents. It is fine-tuned on the BookSum dataset and generates concise, coherent summaries of inputs up to 16,384 tokens long.

Implementation Details

The model was trained for more than 13 epochs on the BookSum dataset, with hyperparameters adjusted between training stages. To produce its summaries it uses features such as encoder_no_repeat_ngram_size and global attention masks.

  • Trained using transformers 4.19.2 and PyTorch 1.11.0
  • Uses beam search with num_beams=4
  • Uses a repetition penalty of 3.5 to discourage repeated text in the output
  • Supports variable length summaries with configurable min/max lengths
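The settings listed above can be passed to the generation call. The following is a minimal sketch, not the maintainer's reference code. It assumes the model is published on the Hugging Face Hub as pszemraj/led-large-book-summary (inferred from the maintainer and model names), that transformers and torch are installed, and that global attention on the first token only is acceptable. The n-gram sizes and min/max lengths are illustrative values that do not come from this page.

```python
# Assumed Hub ID, inferred from the maintainer and model name on this page.
MODEL_ID = "pszemraj/led-large-book-summary"

GENERATION_KWARGS = {
    "num_beams": 4,                     # from the list above
    "repetition_penalty": 3.5,          # from the list above
    "encoder_no_repeat_ngram_size": 3,  # assumed value, not stated on this page
    "no_repeat_ngram_size": 3,          # assumed value, not stated on this page
    "min_length": 8,                    # assumed value; configurable
    "max_length": 256,                  # assumed value; configurable
    "early_stopping": True,
}


def summarize(text: str, model_id: str = MODEL_ID, max_input_tokens: int = 16384) -> str:
    """Summarize `text`, truncating the input to `max_input_tokens` tokens."""
    # Heavy dependencies are imported lazily so the settings above can be
    # inspected without loading torch or downloading the model.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(
        text, return_tensors="pt", truncation=True, max_length=max_input_tokens
    )

    # LED uses local attention by default; here only the first token is
    # given global attention (an assumption of this sketch).
    global_attention_mask = torch.zeros_like(inputs["attention_mask"])
    global_attention_mask[:, 0] = 1

    with torch.no_grad():
        output_ids = model.generate(
            inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            global_attention_mask=global_attention_mask,
            **GENERATION_KWARGS,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling summarize(long_text) downloads the model weights on first use. Raising min_length and max_length in GENERATION_KWARGS yields longer summaries.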

Core Capabilities

  • Long document processing up to 16,384 tokens (16K)
  • Achieves a ROUGE-1 score of 31.73 on the BookSum test set
  • Handles various text types including academic papers, books, and articles
  • Efficient token batching for processing lengthy documents
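This page does not describe how token batching is implemented. As a generic illustration of the idea, the sketch below splits a sequence of token IDs into batches that each fit within the input limit, with an optional overlap so context carries across batch boundaries. The function name batch_tokens and its parameters are hypothetical and are not part of the model or of any library.

```python
from typing import List, Sequence


def batch_tokens(
    token_ids: Sequence[int], batch_size: int = 16384, overlap: int = 0
) -> List[List[int]]:
    """Split token IDs into batches of at most `batch_size` tokens.

    Consecutive batches share `overlap` tokens. Hypothetical helper for
    illustration only.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if not 0 <= overlap < batch_size:
        raise ValueError("overlap must satisfy 0 <= overlap < batch_size")

    step = batch_size - overlap
    batches: List[List[int]] = []
    for start in range(0, len(token_ids), step):
        batches.append(list(token_ids[start:start + batch_size]))
        # Stop once a batch reaches the end, to avoid a trailing batch
        # that only repeats the overlap region.
        if start + batch_size >= len(token_ids):
            break
    return batches


# Small demonstration with toy values instead of real token IDs.
print(batch_tokens(list(range(10)), batch_size=4, overlap=1))
# → [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

Each batch can then be summarized separately and the partial summaries joined. Whether to overlap batches, and by how much, is a trade-off between continuity and redundant computation.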

Frequently Asked Questions

Q: What makes this model unique?

What sets this model apart is that it accepts very long inputs (up to 16,384 tokens) while keeping its summaries coherent. It is fine-tuned on book content and aimed at long-form and academic material, which makes it a good fit for research and content summarization tasks.

Q: What are the recommended use cases?

The model is particularly well-suited for summarizing academic papers, book chapters, long-form articles, and research documents. It's optimized for cases where maintaining context across long passages is crucial.
