Qwen-72B-Chat

Qwen-72B-Chat

Qwen

Powerful 72B parameter chat model with 32k context length, supporting Chinese/English/code tasks. Strong performance across benchmarks with quantization options for efficient deployment.

PropertyValue
Parameter Count72.3B
Context Length32,768 tokens
LicenseTongyi Qianwen License
PaperTechnical Report

What is Qwen-72B-Chat?

Qwen-72B-Chat is a large language model developed by Alibaba Cloud, featuring 72.3 billion parameters and trained on over 3 trillion tokens. It's designed as a versatile AI assistant supporting multiple languages, particularly excelling in Chinese and English tasks, with strong capabilities in code generation and mathematical reasoning.

Implementation Details

The model is built on a Transformer architecture with 80 layers, 64 attention heads, and a model dimension of 8192. It implements modern architectural choices including RoPE positional encoding, SwiGLU activation functions, and RMSNorm. The tokenizer utilizes a comprehensive 151,851-token vocabulary optimized for multilingual processing.

  • Supports multiple precision options: BF16, Int8, and Int4 quantization
  • Requires minimum 144GB GPU memory for BF16/FP16 or 48GB for Int4
  • Compatible with both Hugging Face Transformers and vLLM deployment

Core Capabilities

  • Achieves 80.1% accuracy on C-Eval and 74.3% on MMLU (zero-shot)
  • 64.6% pass rate on HumanEval coding tasks
  • 76.4% accuracy on GSM8K mathematical reasoning
  • Handles 32k context length with strong performance on long-context tasks
  • Supports system prompts for role-playing and task customization

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its comprehensive multilingual vocabulary, extensive training data (3+ trillion tokens), and strong performance across diverse tasks while maintaining efficient deployment options through quantization.

Q: What are the recommended use cases?

Qwen-72B-Chat excels in multilingual conversations, complex reasoning, code generation, and mathematical problem-solving. It's particularly suitable for applications requiring long context understanding and detailed technical discussions.

Socials
Integrations
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026