# Randeng-T5-77M
| Property | Value |
|---|---|
| Parameter Count | 77M |
| Model Type | T5-based NLT model |
| Author | IDEA-CCNL |
| Training Data | WuDao Corpora (180GB) |
| GitHub | Fengshenbang-LM |
## What is Randeng-T5-77M?
Randeng-T5-77M is the Chinese version of mT5-small, designed for Natural Language Transformation (NLT) tasks. It adapts the original mT5 architecture to Chinese through corpus-adaptive pre-training and a tokenizer vocabulary trimmed down to Chinese and English.
## Implementation Details
The model was trained with Corpus-Adaptive Pre-Training (CAPT) on the WuDao Corpora (180GB version). To improve training efficiency, the team retained only the Chinese and English tokens from the original T5 SentencePiece tokenizer, which substantially shrinks the vocabulary and the embedding table.
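The exact filtering procedure is not documented here; the following is a minimal sketch of script-based vocabulary trimming, where the character ranges and the `keep_token` helper are illustrative assumptions, not the documented Randeng-T5-77M method.

```python
import re

# Hypothetical sketch: keep only SentencePiece tokens composed of ASCII,
# CJK characters/punctuation, full/half-width forms, or the SentencePiece
# word-boundary marker (U+2581). Ranges are assumptions for illustration.
KEEP = re.compile(r'^[\u0020-\u007E\u2581\u3000-\u303F\u4E00-\u9FFF\uFF00-\uFFEF]+$')

def keep_token(token: str) -> bool:
    return bool(KEEP.match(token))

vocab = ["▁hello", "世界", "Привет", "▁the", "こんにちは"]
print([t for t in vocab if keep_token(t)])  # ['▁hello', '世界', '▁the']
```

After trimming, the corresponding rows of the embedding matrix would be dropped and the remaining token ids remapped.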
- Training Infrastructure: 8 A100 GPUs for approximately 24 hours
- Pre-training Objective: Span corruption (see the sketch after this list)
- Framework: Fengshen framework
- Vocabulary: Trimmed to Chinese and English tokens
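As a rough illustration of the span-corruption objective, the sketch below masks random spans with T5-style sentinel tokens. The noise rate and span length are raised above T5's usual ~15% density and mean span length 3 so this tiny demo visibly corrupts a span; none of these values are documented Randeng settings.

```python
import random

def span_corrupt(tokens, corruption_rate=0.3, mean_span_len=2, seed=1):
    """Replace random spans with <extra_id_N> sentinels (T5 convention);
    return (encoder input, decoder target) token lists."""
    rng = random.Random(seed)
    inp, tgt = [], []
    i = sid = 0
    while i < len(tokens):
        # Start a span here with probability corruption_rate / mean_span_len,
        # so roughly corruption_rate of all tokens end up masked.
        if rng.random() < corruption_rate / mean_span_len:
            sentinel = f"<extra_id_{sid}>"
            inp.append(sentinel)
            tgt.append(sentinel)
            tgt.extend(tokens[i:i + mean_span_len])
            i += mean_span_len
            sid += 1
        else:
            inp.append(tokens[i])
            i += 1
    tgt.append(f"<extra_id_{sid}>")  # closing sentinel
    return inp, tgt

inp, tgt = span_corrupt("明天 的 天气 预报 说 会 下雨".split())
print(" ".join(inp))  # <extra_id_0> 天气 预报 说 会 下雨
print(" ".join(tgt))  # <extra_id_0> 明天 的 <extra_id_1>
```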
## Core Capabilities
- Specialized Chinese language processing
- Efficient natural language transformation tasks
- Optimized performance for Chinese-English applications
- Compact Chinese-English vocabulary, giving a smaller embedding table and faster decoding
## Frequently Asked Questions
### Q: What makes this model unique?
Its value lies in its specialization for Chinese: Corpus-Adaptive Pre-Training (CAPT) on the WuDao Corpora combined with a vocabulary trimmed to Chinese and English, which keeps the model compact and the pre-training inexpensive (8 A100 GPUs for about 24 hours).
### Q: What are the recommended use cases?
Randeng-T5-77M is well suited to Chinese natural language transformation tasks, text generation, and other NLP applications that require strong Chinese language understanding.
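As a minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub as IDEA-CCNL/Randeng-T5-77M, the model can be loaded with the standard transformers T5 classes. Since it is pre-trained with span corruption rather than instruction-tuned, a sentinel-infilling prompt is the most direct way to probe it:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# use_fast=False assumes the repo ships only a SentencePiece (slow) tokenizer;
# drop it if a fast tokenizer is available.
tokenizer = AutoTokenizer.from_pretrained("IDEA-CCNL/Randeng-T5-77M", use_fast=False)
model = T5ForConditionalGeneration.from_pretrained("IDEA-CCNL/Randeng-T5-77M")

# Span-infilling prompt: the model generates text to fill <extra_id_0>.
inputs = tokenizer("北京是中国的<extra_id_0>。", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For downstream tasks such as translation or summarization, the model would typically be fine-tuned first rather than used zero-shot.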