CPM-Generate

Maintained by: TsinghuaAI

Property         Value
Parameters       2.6 billion
License          MIT
Training Data    100GB Chinese corpus
Paper            View Research Paper

What is CPM-Generate?

CPM-Generate is a Chinese pre-trained language model developed by TsinghuaAI and one of the largest Chinese pre-trained language models available. It is built on the Transformer architecture and has 2.6 billion parameters, trained on a 100GB corpus of Chinese text that includes encyclopedia entries, webpages, stories, news, and dialogues.

Implementation Details

The model uses a dense attention mechanism with a maximum sequence length of 1,024 tokens. Training ran for 20,000 steps on 64 NVIDIA V100 GPUs, with the first 5,000 steps dedicated to warm-up. Optimization used Adam with a learning rate of 1.5×10^-4 and a batch size of 3,072.
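To make these numbers concrete, the following is a minimal sketch of a warm-up schedule and the resulting token budget. It rests on assumptions the text does not confirm: that warm-up is linear, that the rate stays constant afterwards, and that the batch size counts sequences rather than tokens. The function name `lr_at_step` is hypothetical.

```python
# Hyperparameters quoted in the paragraph above.
PEAK_LR = 1.5e-4
WARMUP_STEPS = 5_000
TOTAL_STEPS = 20_000
BATCH_SIZE = 3_072      # assumed to count sequences, not tokens
MAX_SEQ_LEN = 1_024


def lr_at_step(step: int) -> float:
    """Learning rate at a 1-indexed training step.

    Assumption: linear warm-up to PEAK_LR, then a constant rate. The text
    gives only the warm-up length and the peak value, not the schedule shape.
    """
    if not 1 <= step <= TOTAL_STEPS:
        raise ValueError(f"step must be in [1, {TOTAL_STEPS}]")
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR


# Upper bound on tokens processed, assuming every sequence is full length.
TOKENS_PER_STEP = BATCH_SIZE * MAX_SEQ_LEN
MAX_TOKENS_SEEN = TOKENS_PER_STEP * TOTAL_STEPS

if __name__ == "__main__":
    for s in (1, 2_500, 5_000, 20_000):
        print(f"step {s:>6}: lr = {lr_at_step(s):.3e}")
    print(f"tokens per step (upper bound): {TOKENS_PER_STEP:,}")
    print(f"tokens over training (upper bound): {MAX_TOKENS_SEEN:,}")
```

Under these assumptions, each optimizer step covers at most 3,072 × 1,024 = 3,145,728 tokens, or roughly 63 billion tokens over the full 20,000 steps.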

  • Architecture: Transformer-based autoregressive language model
  • Training Data Distribution: Encyclopedia (40GB), Webpage (39GB), Story (10GB), News (10GB), Dialog (1GB)
  • Available Variants: Small (109M params), Medium (334M params), Large (2.6B params)
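As a quick check on the training data distribution listed above, the short sketch below sums the per-source sizes and converts them to shares of the corpus. The variable names are illustrative only.

```python
# Per-source corpus sizes in GB, as listed in the bullet above.
corpus_gb = {
    "Encyclopedia": 40,
    "Webpage": 39,
    "Story": 10,
    "News": 10,
    "Dialog": 1,
}

total_gb = sum(corpus_gb.values())

# Fraction of the corpus contributed by each source.
shares = {name: size / total_gb for name, size in corpus_gb.items()}

if __name__ == "__main__":
    print(f"total: {total_gb} GB")
    for name, share in shares.items():
        print(f"{name:<13}{share:.0%}")
```

The sizes add up to the stated 100GB, with encyclopedia and webpage text together making up 79% of the corpus and dialog data only 1%.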

Core Capabilities

  • Text Generation and Completion
  • Zero-shot Text Classification
  • Chinese Idiom Cloze Tests
  • Conversational Response Generation
  • Few-shot Learning Tasks
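One common way to use an autoregressive language model for zero-shot classification is to fill each candidate label into a prompt and pick the label the model scores highest. The sketch below illustrates that idea only; it is not taken from the CPM-Generate code. The function `zero_shot_classify`, the prompt template, and the `toy_log_likelihood` stand-in are all hypothetical, and a real setup would replace the stand-in with a score computed by the model.

```python
from typing import Callable, Sequence


def zero_shot_classify(
    text: str,
    labels: Sequence[str],
    log_likelihood: Callable[[str], float],
    # Hypothetical template; roughly "{text} The category of this article is: {label}".
    template: str = "{text} 这篇文章的类别是：{label}",
) -> str:
    """Return the label whose filled-in prompt receives the highest score."""
    if not labels:
        raise ValueError("labels must not be empty")
    return max(
        labels,
        key=lambda label: log_likelihood(template.format(text=text, label=label)),
    )


def toy_log_likelihood(prompt: str) -> float:
    """Stand-in scorer for demonstration only. This is NOT a language model.

    It favors prompts where the word for football (足球) appears together
    with the label for sports (体育).
    """
    return 0.0 if ("足球" in prompt and "体育" in prompt) else -1.0


if __name__ == "__main__":
    # Labels: 体育 = sports, 财经 = finance.
    # Text: "Last night's football match was exciting."
    print(zero_shot_classify("昨晚的足球比赛很精彩。", ["体育", "财经"], toy_log_likelihood))
```

Passing the scorer in as a callable keeps the selection logic independent of any particular model API. When labels differ in length, scores are often normalized per token so that longer labels are not penalized.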

Frequently Asked Questions

Q: What makes this model unique?

CPM-Generate stands out as one of the largest Chinese language models, and it is reported to perform strongly in zero-shot and few-shot settings across a range of NLP tasks. Its training data spans multiple domains (encyclopedia, webpage, story, news, and dialog text), which makes it versatile for Chinese language processing.

Q: What are the recommended use cases?

Recommended uses include text generation, conversational systems, essay writing, cloze tests, and language understanding tasks. The model is particularly suited to Chinese language applications that rely on few-shot learning.
