LongLM-large

Maintained By
thu-coai

Property         Value
Parameter Count  993M
Model Type       Text-to-Text Generation
Architecture     T5-based Transformer
Paper            arXiv:2108.12960
Tensor Type      FP16

What is LongLM-large?

LongLM-large is a Chinese pretrained language model developed by thu-coai (the CoAI group at Tsinghua University), designed for long-text understanding and generation. With 993M parameters, it uses a T5-based encoder-decoder architecture with 1,536-dimensional hidden states and 12 attention heads.
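
As a quick sanity check, these hyperparameters can be read off the published model configuration through Hugging Face Transformers. A minimal sketch, assuming the checkpoint is hosted under the thu-coai/LongLM-large repo id on the Hugging Face Hub:

    from transformers import AutoConfig

    # Download only the configuration (no weights); repo id is assumed
    config = AutoConfig.from_pretrained("thu-coai/LongLM-large")

    print(config.d_model)             # hidden size: 1536 per this card
    print(config.num_heads)           # attention heads: 12
    print(config.num_layers)          # encoder layers: 24
    print(config.num_decoder_layers)  # decoder layers: 32
    print(config.d_ff)                # feed-forward size: 3072
    print(config.d_kv)                # key/value size per head: 64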

Implementation Details

The model pairs 24 encoder layers with 32 decoder layers, using a feed-forward dimension of 3,072 and a key/value dimension of 64 per head. It is pretrained on 120GB of Chinese novel data with two primary objectives: text infilling and conditional continuation.

  • Text infilling: masked span lengths are sampled from a Poisson distribution (λ=3), as sketched after this list
  • 15% of each original text is masked
  • Conditional continuation: each text is split at a random point and the model generates the second half from the first
  • Implemented in PyTorch, with Hugging Face Transformers support
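
To make the masking scheme concrete, below is an illustrative sketch of T5-style span corruption with Poisson-distributed span lengths and a 15% masking budget. The function and its sampling details are our own simplification, not the authors' preprocessing code:

    import numpy as np

    rng = np.random.default_rng(0)

    def mask_spans(tokens, mask_ratio=0.15, poisson_lambda=3):
        # Replace random spans (lengths ~ Poisson(lambda)) with T5-style
        # sentinel tokens until roughly mask_ratio of tokens are masked.
        budget = int(len(tokens) * mask_ratio)
        out, masked, sentinel, i = [], 0, 0, 0
        while i < len(tokens):
            if masked < budget and rng.random() < mask_ratio:
                span = max(1, rng.poisson(poisson_lambda))
                out.append(f"<extra_id_{sentinel}>")  # one sentinel per span
                sentinel += 1
                masked += span
                i += span
            else:
                out.append(tokens[i])
                i += 1
        return out

    # Character-level toy example on a Chinese sentence
    print("".join(mask_spans(list("从前有一座山，山里有一座庙，庙里有一个老和尚。"))))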

Core Capabilities

  • Long-form Chinese text generation
  • Text completion and infilling
  • Conditional text generation
  • Story continuation and narrative generation
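
For reference, here is a minimal generation sketch with Hugging Face Transformers, again assuming the thu-coai/LongLM-large repo id; the prompt and sampling settings are illustrative only:

    import torch
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("thu-coai/LongLM-large")
    model = T5ForConditionalGeneration.from_pretrained("thu-coai/LongLM-large")
    model.eval()

    # Story continuation: encode an opening passage and sample a continuation.
    prompt = "从前有一座山，山里有一座庙。"  # "Once there was a mountain with a temple on it."
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids

    with torch.no_grad():
        output_ids = model.generate(
            input_ids,
            do_sample=True,   # sampling suits open-ended, creative text
            top_p=0.9,        # nucleus sampling; value is illustrative
            max_length=128,
        )

    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

For infilling rather than continuation, the input would instead carry T5-style sentinel tokens (e.g. <extra_id_0>) in place of the missing spans, and the model generates the masked content.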

Frequently Asked Questions

Q: What makes this model unique?

LongLM-large stands out for an architecture and pretraining recipe optimized for long-text processing, combining text infilling and conditional continuation objectives. Its large parameter count and pretraining on a 120GB corpus of Chinese novels make it particularly effective for creative text generation in Chinese.

Q: What are the recommended use cases?

The model is best suited for applications requiring long-form Chinese text generation, such as story continuation, creative writing assistance, and document completion. It's particularly effective for tasks requiring understanding and maintaining context over longer sequences.
