CPM-Generate

TsinghuaAI

CPM-Generate is a 2.6B parameter Chinese language model trained on 100GB of diverse text data, capable of text generation, classification, and conversation tasks with strong few-shot performance.

Property        Value
Parameters      2.6 billion
License         MIT
Training Data   100GB Chinese corpus
Paper           View Research Paper

What is CPM-Generate?

CPM-Generate is a Chinese pre-trained language model developed by TsinghuaAI, and it was one of the largest Chinese pre-trained language models at the time of its release. Built on the Transformer architecture, it has 2.6 billion parameters and was trained on a 100GB corpus of Chinese text comprising encyclopedia entries, webpages, stories, news, and dialogues.

Implementation Details

The model utilizes a dense attention mechanism with a maximum sequence length of 1,024 tokens. Training was conducted over 20,000 steps using 64 NVIDIA V100 GPUs, with the first 5,000 steps dedicated to warm-up. The model employs the Adam optimizer with a learning rate of 1.5×10^-4 and a batch size of 3,072.
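
To make the schedule concrete, the following is a minimal sketch of the hyperparameters above. The linear shape of the warm-up and the constant rate afterwards are assumptions, since the text gives only the warm-up length and the peak learning rate. Treating the batch size as 3,072 sequences of up to 1,024 tokens each is also an assumption, and the function names are illustrative.

```python
PEAK_LR = 1.5e-4       # from the text
WARMUP_STEPS = 5_000   # from the text
TOTAL_STEPS = 20_000   # from the text
BATCH_SIZE = 3_072     # from the text; assumed to count sequences
MAX_SEQ_LEN = 1_024    # tokens per sequence, from the text


def learning_rate(step: int) -> float:
    """Learning rate at a zero-indexed training step.

    Assumption: linear warm-up to PEAK_LR, then a constant rate.
    The source does not state the shape of either phase.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    return PEAK_LR


def max_tokens_per_step() -> int:
    """Upper bound on tokens per step if every sequence is full length."""
    return BATCH_SIZE * MAX_SEQ_LEN


if __name__ == "__main__":
    for step in (0, 2_499, 4_999, 19_999):
        print(f"step {step:>6}: lr = {learning_rate(step):.3e}")
    print("max tokens per step:", max_tokens_per_step())            # 3145728
    print("max tokens in total:", max_tokens_per_step() * TOTAL_STEPS)  # 62914560000
```

Under these assumptions, one optimizer step covers at most about 3.1 million tokens, and the full run covers at most about 63 billion tokens.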

  • Architecture: Transformer-based autoregressive language model
  • Training Data Distribution: Encyclopedia (40GB), Webpage (39GB), Story (10GB), News (10GB), Dialog (1GB)
  • Available Variants: Small (109M params), Medium (334M params), Large (2.6B params)
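
As a quick check on the training data distribution listed above, the sketch below converts the per-source sizes into shares of the corpus. The sizes come from the list; the helper name `mixture_shares` is illustrative, and the source does not say whether sampling during training followed these proportions.

```python
# Corpus sizes in GB, taken from the training data distribution above.
CORPUS_GB = {
    "Encyclopedia": 40,
    "Webpage": 39,
    "Story": 10,
    "News": 10,
    "Dialog": 1,
}


def mixture_shares(sizes: dict) -> dict:
    """Return each source's fraction of the total corpus size."""
    total = sum(sizes.values())
    if total <= 0:
        raise ValueError("total corpus size must be positive")
    return {name: size / total for name, size in sizes.items()}


if __name__ == "__main__":
    print("total GB:", sum(CORPUS_GB.values()))  # 100
    for name, share in mixture_shares(CORPUS_GB).items():
        print(f"{name:<13}{share:.0%}")
```

The five sources sum to the stated 100GB, with encyclopedia and webpage text together making up 79% of the corpus.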

Core Capabilities

  • Text Generation and Completion
  • Zero-shot Text Classification
  • Chinese Idiom Cloze Tests
  • Conversational Response Generation
  • Few-shot Learning Tasks
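
One common way to use an autoregressive language model for zero-shot classification or cloze tests is to fill each candidate into a prompt and keep the candidate whose completed text the model finds least surprising, that is, the one with the lowest perplexity. The sketch below illustrates that idea only. The prompt template, the function names, and `stub_token_log_probs` are hypothetical; the stub returns made-up values and stands in for a real call that would return per-token log-probabilities from the model.

```python
import math
from typing import Callable, Dict, Sequence, Tuple


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood), natural log."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def zero_shot_classify(
    text: str,
    labels: Sequence[str],
    score_fn: Callable[[str], Sequence[float]],
    template: str = "{text}这是一条{label}新闻。",  # "This is a {label} news item."
) -> Tuple[str, Dict[str, float]]:
    """Pick the label whose filled-in prompt has the lowest perplexity."""
    scores = {
        label: perplexity(score_fn(template.format(text=text, label=label)))
        for label in labels
    }
    best = min(scores, key=scores.get)
    return best, scores


def stub_token_log_probs(prompt: str) -> Sequence[float]:
    """Hypothetical stand-in for a model call; the values are invented."""
    per_token = -0.5 if "体育" in prompt else -1.5
    return [per_token] * len(prompt)


if __name__ == "__main__":
    # Labels: "体育" (sports), "财经" (finance).
    label, scores = zero_shot_classify(
        "球队在决赛中获胜。", ["体育", "财经"], stub_token_log_probs
    )
    print(label, scores)
```

The same selection rule applies to an idiom cloze test: each candidate idiom is filled into the blank and the lowest-perplexity sentence is chosen.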

Frequently Asked Questions

Q: What makes this model unique?

CPM-Generate stands out as one of the largest Chinese language models, with strong zero-shot and few-shot performance across a range of NLP tasks. Its training data spans encyclopedia, web, story, news, and dialogue text, which makes it versatile across Chinese language processing tasks.

Q: What are the recommended use cases?

The model excels in text generation, conversation systems, essay writing, cloze tests, and language understanding tasks. It's particularly effective for applications requiring few-shot learning capabilities in Chinese language processing.
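
For few-shot use, a prompt typically places a handful of worked examples before the new query and lets the model continue the pattern. The sketch below shows one way to assemble such a prompt. The question and answer prefixes and the function name are assumptions for illustration, not a format prescribed by the model's authors, and the result still has to fit within the model's 1,024-token limit after tokenization.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(
    examples: Sequence[Tuple[str, str]],
    query: str,
    q_prefix: str = "问题：",  # "Question:"
    a_prefix: str = "答案：",  # "Answer:"
) -> str:
    """Join solved examples and an unsolved query into one prompt.

    The prompt ends with the answer prefix so that the model's
    continuation is the answer to the final question.
    """
    blocks = [f"{q_prefix}{q}\n{a_prefix}{a}" for q, a in examples]
    blocks.append(f"{q_prefix}{query}\n{a_prefix}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demo = [
        ("中国的首都是哪里？", "北京"),  # "What is the capital of China?" / "Beijing"
        ("日本的首都是哪里？", "东京"),  # "What is the capital of Japan?" / "Tokyo"
    ]
    # Query: "What is the capital of France?"
    print(build_few_shot_prompt(demo, "法国的首都是哪里？"))
```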
