long-t5-tglobal-base-16384-book-summary
| Property | Value |
|---|---|
| Parameter Count | 248M |
| License | BSD-3-Clause |
| Paper | LongT5 (arXiv:2112.07916) |
| Max Input Length | 16,384 tokens |
| ROUGE-1 Score | 36.41 |
What is long-t5-tglobal-base-16384-book-summary?
This is a specialized variant of Google's LongT5 model, fine-tuned to generate high-quality summaries of lengthy texts. Built on the long-t5-tglobal-base architecture, it was trained on the BookSum dataset for over 30 epochs using V100/A100 GPUs. The model accepts inputs of up to 16,384 tokens and generates summaries of up to 1,024 tokens.
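For a quick start, here is a minimal usage sketch via the transformers summarization pipeline. The Hub repository id `pszemraj/long-t5-tglobal-base-16384-book-summary` is an assumption inferred from this card's title and should be checked against the actual model listing.

```python
# Minimal sketch using the Hugging Face `transformers` summarization pipeline.
# The Hub repo id below is an assumption based on this card's title.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="pszemraj/long-t5-tglobal-base-16384-book-summary",
)

long_text = "..."  # replace with up to ~16,384 tokens of source text
result = summarizer(long_text)
print(result[0]["summary_text"])
```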
Implementation Details
The model builds on LongT5's transient-global (TGlobal) attention, which lets it process long sequences efficiently. Generation uses beam search with no_repeat_ngram_size=3 and repetition_penalty=3.5 to keep summaries coherent and non-repetitive; a decoding sketch follows the list below.
- Trained using 16,384 token input windows with 1,024 token maximum output
- Implements efficient batch processing for very long documents
- Supports both academic and narrative text summarization
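To make the decoding settings above concrete, the sketch below calls generate() directly with beam search, no_repeat_ngram_size=3, and repetition_penalty=3.5. The beam width (num_beams=4) and the Hub repo id are illustrative assumptions, not values confirmed by this card.

```python
# Sketch of calling generate() directly with the decoding settings named above.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "pszemraj/long-t5-tglobal-base-16384-book-summary"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer(
    "very long document text ...",
    return_tensors="pt",
    truncation=True,
    max_length=16384,  # the model's maximum input window
)

with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        num_beams=4,              # illustrative beam width
        no_repeat_ngram_size=3,   # blocks repeated trigrams
        repetition_penalty=3.5,   # penalizes repeated tokens
        max_length=1024,          # maximum summary length in tokens
        early_stopping=True,
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```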
Core Capabilities
- Long-form document summarization with strong ROUGE scores
- Handles multiple document types (academic papers, books, lectures)
- Maintains factual consistency better than previous models
- Efficient processing of documents exceeding 30k tokens by batching fixed-size windows (see the sketch after this list)
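Documents longer than the 16,384-token input window can be handled by splitting them into fixed-size chunks and summarizing each chunk in turn. The following is a minimal sketch of that idea; the exact batching logic used with this model may differ, and the helper function shown here is hypothetical.

```python
# Minimal sketch of windowed batching for inputs beyond the 16,384-token limit:
# tokenize once, split the token ids into fixed-size windows, summarize each
# window, and join the partial summaries. The chunking strategy and helper name
# are illustrative assumptions, not this card's exact pipeline.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "pszemraj/long-t5-tglobal-base-16384-book-summary"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

def summarize_long(text: str, window: int = 16384, max_summary: int = 1024) -> str:
    token_ids = tokenizer(text, truncation=False)["input_ids"]
    partial_summaries = []
    for start in range(0, len(token_ids), window):
        chunk = token_ids[start : start + window]
        output = model.generate(
            input_ids=torch.tensor([chunk]),
            num_beams=4,               # illustrative beam width
            no_repeat_ngram_size=3,
            repetition_penalty=3.5,
            max_length=max_summary,
        )
        partial_summaries.append(
            tokenizer.decode(output[0], skip_special_tokens=True)
        )
    # Join per-window summaries; a second pass could condense them further.
    return "\n".join(partial_summaries)
```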
Frequently Asked Questions
Q: What makes this model unique?
The model's ability to handle 16,384 tokens while maintaining high-quality summarization sets it apart. It's specifically optimized for book-length content and shows strong performance across various text types.
Q: What are the recommended use cases?
Ideal for summarizing long academic texts, books, lectures, and technical documents. It's particularly effective for creating SparkNotes-style summaries and processing lengthy research papers.