TeleChat-7B-int4
| Property | Value |
|---|---|
| Model Size | 7B parameters (4-bit quantized) |
| License | Apache 2.0 |
| Research Paper | arXiv:2401.03804 |
| Architecture | Decoder-only Transformer |
What is TeleChat-7B-int4?
TeleChat-7B-int4 is a 4-bit quantized version of the TeleChat-7B language model developed by Tele-AI. The base model was trained on 1.5 trillion tokens of high-quality Chinese and English text. The 4-bit quantization substantially reduces the model's memory footprint, making deployment on smaller GPUs practical while preserving most of the base model's capabilities.
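A minimal loading and generation sketch is shown below. The Hugging Face repo id `Tele-AI/telechat-7B-int4`, the `trust_remote_code` requirement, and the prompt are assumptions made for illustration; depending on how the weights were quantized, an extra backend (e.g. GPTQ support) may also be required.

```python
# Sketch only: the repo id and loading flags are assumptions, not confirmed by
# this card. TeleChat models typically ship custom modeling code, hence
# trust_remote_code=True.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Tele-AI/telechat-7B-int4"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # place the 4-bit weights on available GPUs
    trust_remote_code=True,
)

prompt = "请帮我写一份季度工作总结的提纲。"  # "Draft an outline for a quarterly work summary."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```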
Implementation Details
The model uses several modern architectural components: Rotary Position Embeddings (RoPE) for position encoding, SwiGLU activation functions in the feed-forward blocks, and RMSNorm for layer normalization. The base architecture consists of 30 layers with a hidden size of 4096 and 32 attention heads. A minimal sketch of the normalization and feed-forward components follows.
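The snippet below is a generic PyTorch sketch of RMSNorm and a SwiGLU feed-forward block as they are commonly defined. It illustrates the components named above; it is not TeleChat's actual implementation, and the dimensions are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescales by 1/RMS(x), no mean subtraction."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    """Gated feed-forward block: SiLU(x W_gate) * (x W_up), projected back down."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))
```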
- Optimized with FlashAttention v2 for 20% faster training
- Supports context lengths up to 8K tokens, expandable to 96K using NTK-aware scaling (see the sketch after this list)
- Implements DeepSpeed with ZeRO parallel optimization for efficient fine-tuning
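The sketch below shows the commonly used NTK-aware formula for stretching RoPE to longer contexts by rescaling the rotary base. TeleChat's exact scaling rule may differ; the head dimension and scale factor here are assumptions for illustration.

```python
def ntk_scaled_rope_base(base: float, head_dim: int, scale: float) -> float:
    """NTK-aware scaling: raise the RoPE base so low-frequency dimensions
    span a longer context window without retraining."""
    return base * scale ** (head_dim / (head_dim - 2))

# Example: extrapolating an 8K-trained model toward a 96K window (factor 12).
# head_dim=128 assumes hidden_size 4096 split across 32 attention heads.
new_base = ntk_scaled_rope_base(base=10000.0, head_dim=128, scale=12.0)
print(f"scaled RoPE base: {new_base:.0f}")
```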
Core Capabilities
- Strong performance on Chinese language tasks and benchmarks (62.1% on C-Eval, 64.3% on CMMLU)
- Excels at long-form content generation including work reports, plans, and technical documents
- Multi-turn dialogue support with specialized mask-loss training (a sketch of one common mask-loss setup follows this list)
- Efficient deployment through 4-bit quantization with minimal loss in quality
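As referenced above, here is a sketch of one common way to implement mask-loss training for multi-turn dialogue: only the assistant's tokens keep their labels, so user turns never contribute to the loss. The label convention and mask construction are generic assumptions, not TeleChat's exact data format.

```python
import torch

IGNORE_INDEX = -100  # positions with this label are ignored by torch.nn.CrossEntropyLoss

def build_labels(input_ids: torch.Tensor, assistant_mask: torch.Tensor) -> torch.Tensor:
    """Return training labels in which only assistant-turn tokens are kept.

    assistant_mask is a boolean tensor (same shape as input_ids) marking
    tokens that belong to the model's own replies.
    """
    labels = input_ids.clone()
    labels[~assistant_mask] = IGNORE_INDEX  # blank out user/system tokens
    return labels
```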
Frequently Asked Questions
Q: What makes this model unique?
The model combines extensive Chinese-language training (1.5T tokens), efficient 4-bit quantization, strong benchmark performance, and practical capabilities for business document generation and multi-turn dialogue.
Q: What are the recommended use cases?
The model excels at generating long-form content like work summaries, project plans, PPT outlines, technical proposals, and professional emails. It's particularly well-suited for business and technical writing tasks in Chinese contexts.
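Building on the loading sketch above, the example below shows a hypothetical prompt for the kind of long-form business writing described here; the prompt wording and sampling settings are illustrative assumptions, not a prescribed format.

```python
# Hypothetical long-form prompt: a weekly project report generated from bullet points.
prompt = (
    "请根据以下要点，生成一份项目周报：\n"
    "1. 完成模型的 4-bit 量化部署\n"
    "2. 复现 C-Eval 与 CMMLU 评测结果\n"
    "3. 下周计划：长文本生成能力测试"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```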