telechat-7B-int8

Tele-AI

TeleChat-7B-int8 is an 8-bit quantized Chinese LLM trained on 1.5T tokens, supporting multi-turn dialogue and long-text generation, with strong performance across standard benchmarks.

  • License: Apache 2.0
  • Architecture: Decoder-only Transformer
  • Parameters: 7 Billion (8-bit quantized)
  • Paper: TeleChat Technical Report

What is TeleChat-7B-int8?

TeleChat-7B-int8 is an 8-bit quantized version of the TeleChat-7B model, developed by China Telecom AI Technology Co., Ltd. It is trained on 1.5 trillion tokens of high-quality Chinese and English text, using a decoder-only architecture with improvements to position encoding, activation functions, and normalization.

Implementation Details

The model implements several technical innovations:

  • Rotary position embedding (RoPE), compatible with Flash-Attention v2 for roughly 20% faster training
  • SwiGLU activation function replacing GELU
  • RMSNorm-based Pre-Normalization
  • 30 layers, hidden size 4096, FFN hidden size 12288
  • 32 attention heads
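To make the RMSNorm and SwiGLU components above concrete, here is a minimal numpy sketch of both. The dimensions are toy-sized for illustration (the actual model uses hidden size 4096 and FFN hidden size 12288); function names and weight initialization are illustrative assumptions, not the model's actual implementation.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: rescale by the root-mean-square of the activations.
    Unlike LayerNorm, no mean is subtracted and no bias is added."""
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward block: SiLU(x @ W_gate) gates (x @ W_up),
    then the result is projected back to the hidden size."""
    silu = lambda z: z / (1.0 + np.exp(-z))  # SiLU / Swish activation
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

# Toy shapes; the real model uses hidden=4096, ffn_hidden=12288
hidden, ffn_hidden = 8, 24
rng = np.random.default_rng(0)
x = rng.standard_normal((2, hidden))
w_gate = rng.standard_normal((hidden, ffn_hidden)) * 0.1
w_up = rng.standard_normal((hidden, ffn_hidden)) * 0.1
w_down = rng.standard_normal((ffn_hidden, hidden)) * 0.1

y = swiglu_ffn(rms_norm(x, np.ones(hidden)), w_gate, w_up, w_down)
print(y.shape)  # same shape as the input: (2, 8)
```

Note that because SwiGLU uses three weight matrices (gate, up, down) instead of two, the FFN hidden size is typically set to roughly two-thirds of the usual 4x expansion, which is consistent with 12288 = 3 x 4096 here.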

Core Capabilities

  • Multi-turn dialogue support with specialized mask loss training
  • Extended context length up to 96K tokens using NTK-aware extrapolation
  • Strong performance in long-form content generation (work reports, plans, PPT outlines)
  • Competitive benchmark scores across MMLU, C-Eval, CMMLU, and other evaluations
  • Efficient 8-bit quantization for reduced memory footprint
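The NTK-aware extrapolation mentioned above extends a RoPE-based model's context window without retraining by enlarging the rotary base so that low frequencies stretch while high frequencies stay nearly intact. A minimal sketch, assuming the commonly used base-scaling formula (the exact scaling TeleChat applies may differ):

```python
import numpy as np

def rope_inv_freq(dim, base=10000.0):
    """Inverse rotary frequencies over a head dimension of size `dim`."""
    return 1.0 / (base ** (np.arange(0, dim, 2) / dim))

def ntk_scaled_base(base, dim, scale):
    """NTK-aware scaling: raise the RoPE base so the lowest frequency is
    stretched by `scale`, extending the usable context by roughly that factor."""
    return base * scale ** (dim / (dim - 2))

dim = 128  # head dim: 4096 hidden size / 32 heads, per the card
orig = rope_inv_freq(dim)
scaled = rope_inv_freq(dim, ntk_scaled_base(10000.0, dim, scale=24.0))

# The lowest frequency is stretched ~24x; the highest is almost unchanged
print(orig[-1] / scaled[-1])  # ~24.0
print(orig[0] / scaled[0])    # 1.0
```

Because only the frequency table changes, this can be applied at inference time, which is how a model trained on shorter sequences can serve contexts up to 96K tokens.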

Frequently Asked Questions

Q: What makes this model unique?

TeleChat-7B-int8 stands out for its efficient 8-bit quantization while maintaining strong performance, especially in Chinese language tasks. It features advanced position encoding and specialized training for multi-turn dialogues, making it particularly suitable for real-world applications.

Q: What are the recommended use cases?

The model excels in long-form content generation, including business documents, academic writing, and creative tasks. It's particularly well-suited for applications requiring extended context understanding and multi-turn conversations while maintaining memory efficiency through quantization.
