Erlangshen-DeBERTa-v2-320M-Chinese

Maintained By
IDEA-CCNL

Property          Value
Parameter Count   320M
License           Apache 2.0
Training Data     WuDao Corpora (180GB)
Architecture      DeBERTa-v2 with Whole Word Masking
Paper             Fengshenbang 1.0

What is Erlangshen-DeBERTa-v2-320M-Chinese?

Erlangshen-DeBERTa-v2-320M-Chinese is a 320M-parameter Chinese language model based on the DeBERTa-v2 architecture and designed for natural language understanding (NLU) tasks. It was trained on the 180GB WuDao Corpora with whole word masking, which masks all characters of a segmented word together rather than individual characters. This suits Chinese, where a word often spans several characters.
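To make the idea of whole word masking concrete, here is a minimal sketch in Python. It assumes the sentence has already been split into words by some segmenter and that each character is one token; the function name `whole_word_mask`, the masking probability, and the sample sentence are illustrative assumptions, not details of the actual pre-training pipeline, which the source does not describe.

```python
import random

MASK = "[MASK]"

def whole_word_mask(words, mask_prob=0.15, rng=None):
    """Mask whole words in a pre-segmented sentence (illustrative sketch).

    `words` is a list of words, each one or more characters long.
    Every character of a selected word is replaced by [MASK], so a
    word is never left partially visible.
    """
    if rng is None:
        rng = random.Random()
    tokens, labels = [], []
    for word in words:
        # One masking decision per word, not per character.
        masked = rng.random() < mask_prob
        for ch in word:
            tokens.append(MASK if masked else ch)
            # The label is the original character for masked positions;
            # None marks positions that are not prediction targets.
            labels.append(ch if masked else None)
    return tokens, labels

# Hypothetical pre-segmented sentence.
words = ["生活", "的", "真谛", "是", "快乐"]
tokens, labels = whole_word_mask(words, mask_prob=0.3, rng=random.Random(0))
print(tokens)
print(labels)
```

With character-level masking, one character of a two-character word could stay visible and give the answer away; deciding once per word removes that shortcut.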

Implementation Details

The model was trained with the Fengshen framework on 8 A100 GPUs (80GB each) for approximately 7 days. It uses DeBERTa's disentangled attention mechanism, which represents each token's content and its relative position with separate vectors, and it is pre-trained on Chinese text.

  • Uses whole word masking, so the characters of a word are masked together
  • Trained on the 180GB WuDao Corpora
  • Uses DeBERTa's disentangled attention mechanism
  • Pre-trained on Chinese text for Chinese NLU tasks
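The disentangled attention mentioned above can be sketched as follows. This is a pure-Python illustration of the score computation as formulated in the original DeBERTa paper (content-to-content, content-to-position, and position-to-content terms, scaled by 1/sqrt(3d)). It is not the Fengshen implementation, and DeBERTa-v2 implementations may differ in details such as how relative positions are bucketed. All function and parameter names here are assumptions chosen for the example.

```python
import math
import random

def rel_pos(i, j, k):
    """Relative distance from token i to token j, clipped into [0, 2k)."""
    d = i - j
    if d <= -k:
        return 0
    if d >= k:
        return 2 * k - 1
    return d + k

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def disentangled_scores(qc, kc, qr, kr, k):
    """Unnormalized attention scores for one head.

    qc, kc: content query/key vectors, one per token (n x d)
    qr, kr: relative-position query/key vectors (2k x d)
    """
    n, d = len(qc), len(qc[0])
    scale = 1.0 / math.sqrt(3 * d)
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            c2c = dot(qc[i], kc[j])                 # content-to-content
            c2p = dot(qc[i], kr[rel_pos(i, j, k)])  # content-to-position
            p2c = dot(kc[j], qr[rel_pos(j, i, k)])  # position-to-content
            scores[i][j] = (c2c + c2p + p2c) * scale
    return scores

# Toy example: 4 tokens, dimension 2, maximum relative distance k = 2.
rng = random.Random(0)
def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

n, d, k = 4, 2, 2
scores = disentangled_scores(rand_matrix(n, d), rand_matrix(n, d),
                             rand_matrix(2 * k, d), rand_matrix(2 * k, d), k)
print(len(scores), len(scores[0]))
```

A softmax over each row of `scores` would then give the attention weights, as in standard self-attention.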

Core Capabilities

  • AFQMC: 74.98% accuracy
  • TNEWS1.1: 58.17% accuracy
  • CMNLI: 83.01% accuracy
  • OCNLI: 80.22% accuracy

Frequently Asked Questions

Q: What makes this model unique?

This model combines the DeBERTa-v2 architecture with whole word masking and pre-training on the 180GB WuDao Corpora. It is reported to outperform similar-sized models on several NLU benchmarks; its scores on AFQMC, TNEWS1.1, CMNLI, and OCNLI are listed under Core Capabilities.

Q: What are the recommended use cases?

The model is intended for Chinese natural language understanding tasks, particularly text classification, natural language inference, and semantic similarity assessment. It suits applications that depend on the meaning of Chinese text rather than surface keyword matching.
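As a rough sketch of how such a task might be set up, the following uses the Hugging Face `transformers` library to attach a classification head for a sentence-pair task such as semantic similarity. The Hub ID in `MODEL_ID`, the `use_fast=False` setting, and the helper names are assumptions for illustration and are not stated in this page.

```python
# Assumption: the checkpoint is published on the Hugging Face Hub under this ID.
MODEL_ID = "IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese"

def load_pair_classifier(num_labels=2):
    """Load the tokenizer and the encoder with a new classification head."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    # Assumption: the slow (non-fast) tokenizer is the one to use here.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_ID, num_labels=num_labels
    )
    return tokenizer, model

def predict_label(logits, label_names):
    """Return the label whose logit is highest."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return label_names[best]

def classify_pair(tokenizer, model, text_a, text_b, label_names):
    """Score one sentence pair and return the predicted label name."""
    import torch
    inputs = tokenizer(text_a, text_b, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return predict_label(logits, label_names)
```

The classification head created by `load_pair_classifier` is randomly initialized, so the output of `classify_pair` is only meaningful after fine-tuning on labeled data for the target task.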
