Randeng-DELLA-CVAE-226M-NER-Chinese

Maintained By: IDEA-CCNL

Property          Value
Parameter Count   226M
Model Type        Conditional Variational Autoencoder (CVAE)
Architecture      GPT-2 based encoder-decoder
Research Paper    Link to Paper
Training Data     Wudao dataset + NER fine-tuning

What is Randeng-DELLA-CVAE-226M-NER-Chinese?

This is a Chinese language model that combines a deep variational autoencoder with controlled text generation. It was pretrained on the Wudao dataset and then fine-tuned for Named Entity Recognition (NER) tasks, so that it generates sentences containing specified named entities of specified types.

Implementation Details

Both the encoder and the decoder are built from GPT-2 components. Unlike the implementation in the original DELLA paper, the model fuses information with a linear transformation followed by element-wise addition rather than a low-rank tensor product, a simplification reported to be more stable for open-domain pretraining.

  • Layer-wise recurrent latent variables structure
  • Modified information fusion mechanism for improved stability
  • Specialized tokenization with support for entity markers
  • 226 million trainable parameters
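
The fusion step described above (linear transformation, then element-wise addition) can be illustrated with a minimal, dependency-free sketch. This is not the maintainers' code: the function names, the weight layout, and the toy dimensions are all assumptions chosen to keep the arithmetic visible, and a real implementation would use tensor operations on batched hidden states.

```python
from typing import List

Vector = List[float]


def linear(z: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """Project a latent vector of size d_latent to size d_model.

    weight has shape (d_model, d_latent); both parameters are hypothetical.
    """
    return [sum(w * x for w, x in zip(row, z)) + b for row, b in zip(weight, bias)]


def fuse_latent(hidden: List[Vector], z: Vector,
                weight: List[Vector], bias: Vector) -> List[Vector]:
    """Add the projected latent vector to the hidden state at every position."""
    projected = linear(z, weight, bias)
    return [[h + p for h, p in zip(position, projected)] for position in hidden]


# Toy sizes (assumed): 3 sequence positions, d_model = 2, d_latent = 3.
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
z = [1.0, 0.0, -1.0]
weight = [[0.5, 0.0, 0.5], [1.0, 1.0, 0.0]]
bias = [0.0, 0.5]

fused = fuse_latent(hidden, z, weight, bias)
print(fused)  # [[1.0, 3.5], [3.0, 5.5], [5.0, 7.5]]
```

In a layer-wise latent structure, a step like this would run once per layer with that layer's own latent variable. The sketch shows a single layer only.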

Core Capabilities

  • Generate coherent Chinese text containing specified named entities
  • Control generation through entity type specifications
  • Handle multiple entity constraints simultaneously
  • Support for various entity types including locations and temporal expressions

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its ability to generate Chinese text while maintaining control over the inclusion of specific named entities. Its modified architecture provides better stability for open-domain applications while preserving the benefits of variational modeling.

Q: What are the recommended use cases?

The model is suited to applications that require controlled text generation in Chinese, such as automated content creation with specific entity requirements, data augmentation for NER tasks, and generating context-rich examples for language learning or testing.
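
For the NER data augmentation use case, generated sentences still have to be turned into labeled examples. The sketch below shows one way to do that, under assumptions: the helper `bio_tags` is hypothetical, the labels `LOC` and `TIME` are illustrative rather than the model's actual tag set, and the sentence is a hand-written stand-in for model output. It tags the first occurrence of each requested entity at character level and rejects any sentence in which an entity is missing.

```python
from typing import List, Tuple


def bio_tags(sentence: str, entities: List[Tuple[str, str]]) -> List[str]:
    """Return character-level BIO tags for the first occurrence of each entity.

    entities is a list of (entity_text, entity_type) pairs. A ValueError is
    raised when an entity is absent, so unusable generations can be discarded.
    """
    tags = ["O"] * len(sentence)
    for text, label in entities:
        start = sentence.find(text)
        if start == -1:
            raise ValueError(f"entity {text!r} not found in generated sentence")
        tags[start] = f"B-{label}"
        for i in range(start + 1, start + len(text)):
            tags[i] = f"I-{label}"
    return tags


# Hand-written stand-in for a generated sentence ("Yesterday I went to Shenzhen").
sentence = "昨天我去了深圳"
entities = [("深圳", "LOC"), ("昨天", "TIME")]

tags = bio_tags(sentence, entities)
for char, tag in zip(sentence, tags):
    print(char, tag)
```

Filtering on the raised error keeps only generations that actually satisfy every entity constraint, which matters because nothing here guarantees that a generative model will honor all of them.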
