TeleChat-7B-int8
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Architecture | Decoder-only Transformer |
| Parameters | 7 Billion (8-bit quantized) |
| Paper | TeleChat Technical Report |
What is TeleChat-7B-int8?
TeleChat-7B-int8 is an 8-bit quantized version of the TeleChat-7B model, developed by China Telecom AI Technology Co., Ltd. It was trained on 1.5 trillion tokens of high-quality Chinese and English text and uses a decoder-only Transformer architecture with improvements to position encoding and activation functions.
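A minimal loading and generation sketch with the Hugging Face transformers library is shown below. The repository id `Tele-AI/telechat-7B-int8`, the prompt, and the use of a plain `generate` call are assumptions for illustration; the official release ships custom modeling code via `trust_remote_code` and may expose its own chat interface.

```python
# Minimal usage sketch (assumed repo id and a standard transformers interface;
# the official release may provide a dedicated chat API instead of generate()).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tele-AI/telechat-7B-int8"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # place the 8-bit weights on available devices
    trust_remote_code=True,   # TeleChat ships custom modeling code
)

prompt = "Draft an outline for a quarterly work report on smart city construction."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```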
Implementation Details
The model implements several technical innovations (see the sketch after this list):
- Rotary position embeddings (RoPE), which improve training speed by 20% and are compatible with Flash-Attention v2
- SwiGLU activation function in place of GELU
- Pre-normalization with RMSNorm
- 30 layers with a hidden size of 4096 and an FFN hidden size of 12288
- 32 attention heads
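The following is a minimal PyTorch sketch of the RMSNorm and SwiGLU components listed above, using the stated dimensions (hidden size 4096, FFN hidden size 12288). It is an illustrative reimplementation of these standard building blocks, not the model's actual source code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

HIDDEN_SIZE = 4096       # per the architecture details above
FFN_HIDDEN_SIZE = 12288

class RMSNorm(nn.Module):
    """Pre-normalization via root-mean-square layer norm (no mean centering)."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * x * rms

class SwiGLU(nn.Module):
    """Feed-forward block with SwiGLU activation: (SiLU(x W_gate) * x W_up) W_down."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.gate_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.up_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.down_proj = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))

x = torch.randn(1, 8, HIDDEN_SIZE)
y = SwiGLU(HIDDEN_SIZE, FFN_HIDDEN_SIZE)(RMSNorm(HIDDEN_SIZE)(x))
print(y.shape)  # torch.Size([1, 8, 4096])
```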
Core Capabilities
- Multi-turn dialogue support with specialized mask loss training
- Extended context length of up to 96K tokens via NTK-aware extrapolation (see the sketch after this list)
- Strong performance in long-form content generation (work reports, plans, PPT outlines)
- Competitive benchmark scores across MMLU, C-Eval, CMMLU, and other evaluations
- Efficient 8-bit quantization for reduced memory footprint
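As a rough illustration of NTK-aware extrapolation, the sketch below rescales the RoPE base so that rotary frequencies cover a longer context window, following the commonly used rule base' = base * s^(d/(d-2)). The original training length (8K) and the exact scaling scheme TeleChat uses are assumptions; this is not the model's official implementation.

```python
import torch

def ntk_scaled_rope_frequencies(head_dim: int, base: float = 10000.0,
                                train_ctx: int = 8192, target_ctx: int = 96 * 1024):
    """Rescale the RoPE base so rotary frequencies cover a longer context.

    Applies the common NTK-aware rule base' = base * s ** (d / (d - 2)),
    where s = target_ctx / train_ctx. The 8K training length here is an
    assumption used only for illustration.
    """
    scale = target_ctx / train_ctx
    ntk_base = base * scale ** (head_dim / (head_dim - 2))
    inv_freq = 1.0 / (ntk_base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    return inv_freq

# 4096 hidden size / 32 heads = 128 dims per head (from the details above)
inv_freq = ntk_scaled_rope_frequencies(head_dim=128)
print(inv_freq.shape)  # torch.Size([64])
```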
Frequently Asked Questions
Q: What makes this model unique?
TeleChat-7B-int8 stands out for its efficient 8-bit quantization while maintaining strong performance, especially in Chinese language tasks. It features advanced position encoding and specialized training for multi-turn dialogues, making it particularly suitable for real-world applications.
Q: What are the recommended use cases?
The model excels in long-form content generation, including business documents, academic writing, and creative tasks. It's particularly well-suited for applications requiring extended context understanding and multi-turn conversations while maintaining memory efficiency through quantization.