Randeng-Pegasus-238M-Summary-Chinese

IDEA-CCNL

A Chinese PEGASUS model (238M parameters) fine-tuned on 7 Chinese summarization datasets for abstractive summarization. It achieves ROUGE-1/2/L scores of 43.46/29.59/39.76 on the LCSTS benchmark.

Property             Value
Parameter Count      238M
Model Type           Abstractive Summarization
Base Architecture    PEGASUS
Reference Paper      PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
Training Datasets    7 Chinese datasets (4M samples)

What is Randeng-Pegasus-238M-Summary-Chinese?

Randeng-Pegasus-238M-Summary-Chinese is a Chinese model based on the PEGASUS architecture and fine-tuned for text summarization. The fine-tuning data comprises approximately 4 million samples drawn from 7 Chinese datasets covering education, news, social media, and general content.

Implementation Details

The model starts from the Randeng-Pegasus-238M-Chinese base and is fine-tuned on seven datasets: education, new2016zh, nlpcc, shence, sohu, thucnews, and weibo. On the LCSTS benchmark it scores 43.46 ROUGE-1, 29.59 ROUGE-2, and 39.76 ROUGE-L.
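To make the ROUGE-1/2/L figures easier to interpret, here is a minimal sketch of how such scores can be computed. It assumes character-level tokens and F1 scoring, with the reported values being these fractions multiplied by 100; the page does not state the tokenization or ROUGE variant actually used, so treat this as an illustration rather than the evaluation script behind the numbers. All function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 over characters (assumption: one character = one token)."""
    cand = ngrams(list(candidate), n)
    ref = ngrams(list(reference), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of characters."""
    return f1(lcs_length(candidate, reference), len(candidate), len(reference))


if __name__ == "__main__":
    generated = "今天天气很好"
    gold = "今天天气不错"
    print(rouge_n(generated, gold, 1), rouge_n(generated, gold, 2), rouge_l(generated, gold))
```

ROUGE-1 and ROUGE-2 reward shared characters and character pairs, while ROUGE-L rewards the longest in-order match, which is why the three numbers for one model usually differ.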

  • 238M parameter architecture optimized for Chinese text
  • Base model pre-trained with PEGASUS's gap-sentence generation objective
  • Specialized tokenizer for Chinese language processing
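The gap-sentence generation objective mentioned in the list above can be sketched as follows: the sentences that best summarize the rest of the document are removed from the input and become the generation target. This is a simplified illustration that scores each sentence independently by character-level unigram F1 against the remaining text; the `<mask_1>` placeholder string, the sentence splitter, and all function names are assumptions for this example, not details taken from the model's actual pre-training code.

```python
import re
from collections import Counter

MASK_TOKEN = "<mask_1>"  # placeholder string, assumed for illustration


def split_sentences(text):
    """Split after Chinese sentence-ending punctuation, keeping the punctuation."""
    parts = re.split(r"(?<=[。！？])", text)
    return [p for p in parts if p.strip()]


def unigram_f1(a, b):
    """Character-level unigram F1 between two strings (punctuation included)."""
    overlap = sum((Counter(a) & Counter(b)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(a), overlap / len(b)
    return 2 * precision * recall / (precision + recall)


def gap_sentence_example(text, num_gaps=1):
    """Build one (source, target) pair by masking the top-scoring sentences."""
    sentences = split_sentences(text)
    scored = []
    for i, sentence in enumerate(sentences):
        rest = "".join(s for j, s in enumerate(sentences) if j != i)
        scored.append((unigram_f1(sentence, rest), i))
    # Highest score first; ties are broken by position in the document.
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    chosen = sorted(i for _, i in ranked[:num_gaps])
    source = "".join(MASK_TOKEN if i in chosen else s for i, s in enumerate(sentences))
    target = "".join(sentences[i] for i in chosen)
    return source, target


if __name__ == "__main__":
    document = "北京今天下雨。北京明天下雨。我喜欢猫。"
    source, target = gap_sentence_example(document)
    print(source)
    print(target)
```

Because the pre-training target is a set of whole sentences chosen for their overlap with the rest of the document, the objective resembles summarization itself, which is the motivation the PEGASUS paper gives for it.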

Core Capabilities

  • Abstractive text summarization for Chinese content
  • Efficient processing of long-form Chinese text
  • Generation of concise, coherent summaries
  • Support for various content domains (news, education, social media)

Frequently Asked Questions

Q: What makes this model unique?

This model targets Chinese text summarization specifically, combining the PEGASUS architecture with fine-tuning on 7 Chinese datasets. Because those datasets span news, education, and social media, it can be applied across several summarization domains.

Q: What are the recommended use cases?

The model is suited to automated news summarization, content condensation, and document abstracting in Chinese, particularly where longer Chinese texts must be reduced to concise summaries that stay coherent and retain key information.
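For these use cases, a summarization call might look like the sketch below, written against the Hugging Face transformers interface. Several things here are assumptions rather than facts from this page: the hub identifier in `MODEL_ID`, the 1024-token input cap, and the helper names `load_model` and `summarize`. The tokenizer is deliberately passed in by the caller, because the page mentions a specialized Chinese tokenizer but does not say which class loads it.

```python
# Assumed hub identifier, formed from the publisher and model names on this page.
MODEL_ID = "IDEA-CCNL/Randeng-Pegasus-238M-Summary-Chinese"


def load_model(model_id=MODEL_ID):
    """Load the seq2seq weights; requires transformers and network access."""
    from transformers import PegasusForConditionalGeneration

    return PegasusForConditionalGeneration.from_pretrained(model_id)


def summarize(text, model, tokenizer, max_input_tokens=1024, **generate_kwargs):
    """Summarize one document with a model and a compatible tokenizer.

    `tokenizer` must follow the usual Hugging Face tokenizer interface
    (callable, plus `batch_decode`). Which tokenizer class is compatible
    with this checkpoint is not stated here, so it is left to the caller.
    The 1024-token cap is an assumed default, not a documented limit.
    """
    inputs = tokenizer(
        text,
        max_length=max_input_tokens,
        truncation=True,
        return_tensors="pt",
    )
    summary_ids = model.generate(inputs["input_ids"], **generate_kwargs)
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]
```

Extra keyword arguments such as `num_beams` or `max_new_tokens` are forwarded to `model.generate`, so decoding settings can be tuned per application without changing the helper.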
