TeleChat-7B-int4
| Property | Value |
|---|---|
| Model Size | 7B parameters (4-bit quantized) |
| License | Apache 2.0 |
| Research Paper | arXiv:2401.03804 |
| Architecture | Decoder-only Transformer |
What is TeleChat-7B-int4?
TeleChat-7B-int4 is a 4-bit quantized version of the TeleChat-7B language model developed by Tele-AI. The base model was trained on 1.5 trillion tokens of high-quality Chinese and English text. The 4-bit quantization substantially reduces the model's memory footprint, making deployment on smaller GPUs practical while preserving most of the base model's capabilities.
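A minimal loading and generation sketch is shown below. The Hugging Face repo id `Tele-AI/telechat-7B-int4`, the `trust_remote_code` requirement, and the prompt are assumptions made for illustration; depending on how the weights were quantized, an extra backend (e.g. GPTQ support) may also be required.

```python
# Sketch only: the repo id and loading flags are assumptions, not confirmed by
# this card. TeleChat models typically ship custom modeling code, hence
# trust_remote_code=True.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Tele-AI/telechat-7B-int4"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # place the 4-bit weights on available GPUs
    trust_remote_code=True,
)

prompt = "请帮我写一份季度工作总结的提纲。"  # "Draft an outline for a quarterly work summary."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```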
Implementation Details
The model uses several modern architectural components: Rotary Position Embeddings (RoPE) for position encoding, SwiGLU activation functions in the feed-forward blocks, and RMSNorm for layer normalization. The base architecture consists of 30 layers with a hidden size of 4096 and 32 attention heads. A minimal sketch of the normalization and feed-forward components follows.
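The snippet below is a generic PyTorch sketch of RMSNorm and a SwiGLU feed-forward block as they are commonly defined. It illustrates the components named above; it is not TeleChat's actual implementation, and the dimensions are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescales by 1/RMS(x), no mean subtraction."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    """Gated feed-forward block: SiLU(x W_gate) * (x W_up), projected back down."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))
```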
- Optimized with FlashAttention v2 for 20% faster training
- Supports context lengths up to 8K tokens, expandable to 96K using NTK-aware scaling (see the sketch after this list)
- Implements DeepSpeed with ZeRO parallel optimization for efficient fine-tuning
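The sketch below shows the commonly used NTK-aware formula for stretching RoPE to longer contexts by rescaling the rotary base. TeleChat's exact scaling rule may differ; the head dimension and scale factor here are assumptions for illustration.

```python
def ntk_scaled_rope_base(base: float, head_dim: int, scale: float) -> float:
    """NTK-aware scaling: raise the RoPE base so low-frequency dimensions
    span a longer context window without retraining."""
    return base * scale ** (head_dim / (head_dim - 2))

# Example: extrapolating an 8K-trained model toward a 96K window (factor 12).
# head_dim=128 assumes hidden_size 4096 split across 32 attention heads.
new_base = ntk_scaled_rope_base(base=10000.0, head_dim=128, scale=12.0)
print(f"scaled RoPE base: {new_base:.0f}")
```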
Core Capabilities
- Strong performance on Chinese language tasks and benchmarks (62.1% on C-Eval, 64.3% on CMMLU)
- Excels at long-form content generation including work reports, plans, and technical documents
- Multi-turn dialogue support with specialized mask-loss training (a sketch of one common mask-loss setup follows this list)
- Efficient deployment through 4-bit quantization with minimal loss in quality
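As referenced above, here is a sketch of one common way to implement mask-loss training for multi-turn dialogue: only the assistant's tokens keep their labels, so user turns never contribute to the loss. The label convention and mask construction are generic assumptions, not TeleChat's exact data format.

```python
import torch

IGNORE_INDEX = -100  # positions with this label are ignored by torch.nn.CrossEntropyLoss

def build_labels(input_ids: torch.Tensor, assistant_mask: torch.Tensor) -> torch.Tensor:
    """Return training labels in which only assistant-turn tokens are kept.

    assistant_mask is a boolean tensor (same shape as input_ids) marking
    tokens that belong to the model's own replies.
    """
    labels = input_ids.clone()
    labels[~assistant_mask] = IGNORE_INDEX  # blank out user/system tokens
    return labels
```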
Frequently Asked Questions
Q: What makes this model unique?
The model combines extensive Chinese-language training (1.5T tokens), efficient 4-bit quantization, strong benchmark performance, and practical capabilities for business document generation and multi-turn dialogue.
Q: What are the recommended use cases?
The model excels at generating long-form content like work summaries, project plans, PPT outlines, technical proposals, and professional emails. It's particularly well-suited for business and technical writing tasks in Chinese contexts.
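Building on the loading sketch above, the example below shows a hypothetical prompt for the kind of long-form business writing described here; the prompt wording and sampling settings are illustrative assumptions, not a prescribed format.

```python
# Hypothetical long-form prompt: a weekly project report generated from bullet points.
prompt = (
    "请根据以下要点，生成一份项目周报：\n"
    "1. 完成模型的 4-bit 量化部署\n"
    "2. 复现 C-Eval 与 CMMLU 评测结果\n"
    "3. 下周计划：长文本生成能力测试"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```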