telechat-7B-int8

Tele-AI

TeleChat-7B-int8 is an 8-bit quantized Chinese LLM trained on 1.5T tokens, supporting multi-turn dialogue and long-text generation, with strong performance across standard benchmarks.

  • License: Apache 2.0
  • Architecture: Decoder-only Transformer
  • Parameters: 7 Billion (8-bit quantized)
  • Paper: TeleChat Technical Report

What is TeleChat-7B-int8?

TeleChat-7B-int8 is an 8-bit quantized version of the TeleChat-7B model, developed by China Telecom AI Technology Co., Ltd. It is trained on 1.5 trillion tokens of high-quality Chinese and English text, using a decoder-only architecture with improvements to position encoding, activation functions, and normalization.

Implementation Details

The model implements several technical innovations:

  • Rotary position embedding (RoPE), compatible with Flash-Attention v2 for roughly 20% faster training
  • SwiGLU activation function replacing GELU
  • RMSNorm-based Pre-Normalization
  • 30 layers, hidden size 4096, FFN hidden size 12288
  • 32 attention heads
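To make the RMSNorm and SwiGLU components above concrete, here is a minimal numpy sketch of both. The dimensions are toy-sized for illustration (the actual model uses hidden size 4096 and FFN hidden size 12288); function names and weight initialization are illustrative assumptions, not the model's actual implementation.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: rescale by the root-mean-square of the activations.
    Unlike LayerNorm, no mean is subtracted and no bias is added."""
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward block: SiLU(x @ W_gate) gates (x @ W_up),
    then the result is projected back to the hidden size."""
    silu = lambda z: z / (1.0 + np.exp(-z))  # SiLU / Swish activation
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

# Toy shapes; the real model uses hidden=4096, ffn_hidden=12288
hidden, ffn_hidden = 8, 24
rng = np.random.default_rng(0)
x = rng.standard_normal((2, hidden))
w_gate = rng.standard_normal((hidden, ffn_hidden)) * 0.1
w_up = rng.standard_normal((hidden, ffn_hidden)) * 0.1
w_down = rng.standard_normal((ffn_hidden, hidden)) * 0.1

y = swiglu_ffn(rms_norm(x, np.ones(hidden)), w_gate, w_up, w_down)
print(y.shape)  # same shape as the input: (2, 8)
```

Note that because SwiGLU uses three weight matrices (gate, up, down) instead of two, the FFN hidden size is typically set to roughly two-thirds of the usual 4x expansion, which is consistent with 12288 = 3 x 4096 here.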

Core Capabilities

  • Multi-turn dialogue support with specialized mask loss training
  • Extended context length up to 96K tokens using NTK-aware extrapolation
  • Strong performance in long-form content generation (work reports, plans, PPT outlines)
  • Competitive benchmark scores across MMLU, C-Eval, CMMLU, and other evaluations
  • Efficient 8-bit quantization for reduced memory footprint
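The NTK-aware extrapolation mentioned above extends a RoPE-based model's context window without retraining by enlarging the rotary base so that low frequencies stretch while high frequencies stay nearly intact. A minimal sketch, assuming the commonly used base-scaling formula (the exact scaling TeleChat applies may differ):

```python
import numpy as np

def rope_inv_freq(dim, base=10000.0):
    """Inverse rotary frequencies over a head dimension of size `dim`."""
    return 1.0 / (base ** (np.arange(0, dim, 2) / dim))

def ntk_scaled_base(base, dim, scale):
    """NTK-aware scaling: raise the RoPE base so the lowest frequency is
    stretched by `scale`, extending the usable context by roughly that factor."""
    return base * scale ** (dim / (dim - 2))

dim = 128  # head dim: 4096 hidden size / 32 heads, per the card
orig = rope_inv_freq(dim)
scaled = rope_inv_freq(dim, ntk_scaled_base(10000.0, dim, scale=24.0))

# The lowest frequency is stretched ~24x; the highest is almost unchanged
print(orig[-1] / scaled[-1])  # ~24.0
print(orig[0] / scaled[0])    # 1.0
```

Because only the frequency table changes, this can be applied at inference time, which is how a model trained on shorter sequences can serve contexts up to 96K tokens.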

Frequently Asked Questions

Q: What makes this model unique?

TeleChat-7B-int8 stands out for its efficient 8-bit quantization while maintaining strong performance, especially in Chinese language tasks. It features advanced position encoding and specialized training for multi-turn dialogues, making it particularly suitable for real-world applications.

Q: What are the recommended use cases?

The model excels in long-form content generation, including business documents, academic writing, and creative tasks. It's particularly well-suited for applications requiring extended context understanding and multi-turn conversations while maintaining memory efficiency through quantization.
