Tencent-Hunyuan-Large

Maintained By: tencent

  • Total Parameters: 389 billion
  • Active Parameters: 52 billion
  • License: Tencent License
  • Paper: arXiv:2411.02265
  • Maximum Context Length: 256K tokens (Pretrain), 128K tokens (Instruct)

What is Tencent-Hunyuan-Large?

Tencent-Hunyuan-Large is currently the largest open-source Transformer-based Mixture of Experts (MoE) model in the industry. Rather than running a dense network, it activates only 52 billion of its 389 billion parameters per token, aiming for flagship-level quality at a fraction of the inference compute of a comparably capable dense model.
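
To make the "total vs. active parameters" distinction concrete, here is a minimal, generic top-k MoE layer in PyTorch. The sizes, expert count, and routing details below are illustrative assumptions, not Hunyuan-Large's actual configuration: a router scores each token, only the top-k experts run for that token, and the remaining experts' parameters stay idle.

```python
# Generic top-k MoE layer sketch -- sizes and top_k are illustrative,
# NOT Hunyuan-Large's published configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=1024, d_ff=4096, n_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        scores = F.softmax(self.router(x), dim=-1)       # routing probabilities
        weights, idx = scores.topk(self.top_k, dim=-1)   # pick top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():                           # run expert e only on its tokens
                    out[mask] += weights[mask, k:k + 1] * self.experts[e](x[mask])
        return out
```

Every token thus touches only top_k of the n_experts feed-forward blocks, which is why the active parameter count (52B) can be far smaller than the total (389B).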

Implementation Details

The model employs several techniques to reach this balance of quality and efficiency:

  • KV cache compression using Grouped Query Attention (GQA) together with Cross-Layer Attention (CLA), which lets groups of adjacent layers share a single key/value cache (see the sizing sketch after this list)
  • Expert-specific learning rate scaling for more stable MoE training (see the second sketch below)
  • High-quality synthetic data enhancement for improved generalization
  • Long-context processing up to 256K tokens in pretraining (128K for the instruct model)
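
The savings from GQA and CLA are easy to see with back-of-the-envelope cache sizing. The sketch below uses made-up shapes (layer count, KV head count, head dimension, sharing factor) purely for illustration; they are not Hunyuan-Large's published configuration.

```python
# Back-of-the-envelope KV-cache sizing, showing why GQA and CLA help.
# All shapes here are illustrative assumptions, not the real Hunyuan-Large config.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch=1,
                   bytes_per_elem=2, cla_share_factor=1):
    """K and V each hold batch * seq_len * n_kv_heads * head_dim elements per layer.
    cla_share_factor > 1 models Cross-Layer Attention: groups of layers
    share one KV cache, dividing the effective layer count."""
    effective_layers = n_layers / cla_share_factor
    return 2 * batch * seq_len * n_kv_heads * head_dim * effective_layers * bytes_per_elem

mha = kv_cache_bytes(n_layers=64, n_kv_heads=64, head_dim=128, seq_len=256_000)  # full multi-head
gqa = kv_cache_bytes(n_layers=64, n_kv_heads=8,  head_dim=128, seq_len=256_000)  # GQA: fewer KV heads
gqa_cla = kv_cache_bytes(n_layers=64, n_kv_heads=8, head_dim=128, seq_len=256_000,
                         cla_share_factor=2)                                      # + CLA sharing
print(f"MHA: {mha/2**30:.0f} GiB | GQA: {gqa/2**30:.0f} GiB | GQA+CLA: {gqa_cla/2**30:.0f} GiB")
# -> MHA: 500 GiB | GQA: 62 GiB | GQA+CLA: 31 GiB (for these assumed shapes)
```

At 256K-token contexts, cutting the KV cache by an order of magnitude is the difference between fitting on a node and not fitting at all.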
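Expert-specific learning rate scaling follows from the observation that each expert only sees the tokens routed to it, so its effective batch size is smaller than that of the shared dense weights. The sketch below uses an assumed square-root scaling heuristic and a hypothetical parameter-name filter; the paper's exact rule may differ.

```python
# Sketch of expert-specific learning-rate scaling. The sqrt(top_k / n_experts)
# rule and the "experts" name filter are assumptions for illustration only.
import torch

def build_param_groups(model, base_lr, n_experts, top_k):
    # Each expert processes roughly top_k / n_experts of the tokens the shared
    # weights see, so a common heuristic scales its LR by the square root of
    # that ratio (mirroring LR-vs-batch-size scaling rules).
    expert_lr = base_lr * (top_k / n_experts) ** 0.5
    expert_params, shared_params = [], []
    for name, p in model.named_parameters():
        (expert_params if "experts" in name else shared_params).append(p)
    return [
        {"params": shared_params, "lr": base_lr},
        {"params": expert_params, "lr": expert_lr},
    ]

# Usage (model is any MoE module whose expert weights contain "experts" in the name):
# optimizer = torch.optim.AdamW(build_param_groups(model, 3e-4, n_experts=16, top_k=2))
```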

Core Capabilities

  • Exceptional performance on MMLU (89.9% for instruct version)
  • Superior mathematical reasoning (77.4% on MATH dataset)
  • Strong multilingual capabilities, particularly in Chinese language tasks
  • Robust performance in commonsense reasoning and knowledge-based tasks

Frequently Asked Questions

Q: What makes this model unique?

The model's MoE architecture uses 389B total parameters but activates only 52B (roughly 13%) per forward pass, making it far cheaper to run than a dense model of similar capacity while maintaining SOTA benchmark results. Its KV cache compression (GQA plus CLA) and expert-specific learning rate scaling further set it apart from conventional dense Transformers.

Q: What are the recommended use cases?

The model excels in diverse applications including complex reasoning, mathematical problem-solving, multilingual tasks, and long-context processing. It's particularly strong in academic and knowledge-intensive applications.
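
For reference, a minimal loading sketch via Hugging Face transformers is below. The repository id, chat template usage, and trust_remote_code requirement are assumptions; consult the official model card for the exact id and hardware guidance before running.

```python
# Minimal loading sketch. The repo id below is an ASSUMPTION -- verify it on
# the official model card. Even with only 52B active parameters, all 389B
# weights must be resident, so multiple GPUs are required.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/Tencent-Hunyuan-Large"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # use the checkpoint's native dtype
    device_map="auto",       # shard weights across available GPUs
    trust_remote_code=True,  # custom MoE architectures ship their own modeling code
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```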
