yayi2-30b

Maintained by: wenge-research

YAYI2-30B

Property          Value
Parameter Count   30 billion
Architecture      Transformer (64 layers, 64 heads)
Context Length    4,096 tokens
Training Data     2.65T tokens (multilingual)
License           Apache-2.0 (code) / Custom (model)
Paper             arXiv:2312.14862

What is YAYI2-30B?

YAYI2-30B is a 30-billion-parameter large language model developed by Wenge Technology, with an emphasis on multilingual capability. It is a Transformer-based model pretrained on 2.65 trillion tokens spanning multiple languages.

Implementation Details

The model has 64 transformer layers, 64 attention heads, and a hidden size of 7168. Its vocabulary contains 81,920 tokens, and it supports a context length of 4,096 tokens. Inference requires at least 80 GB of GPU memory.
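The figures above can be sanity-checked with some quick arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes exactly 30 billion parameters stored as 16-bit weights (2 bytes each) and assumes the hidden size is split evenly across attention heads. Neither assumption is stated in the source, and the helper names are hypothetical.

```python
# Rough estimates derived from the published specifications.
# Assumptions (not from the source): exactly 30e9 parameters, 16-bit weights,
# and a per-head dimension equal to hidden_size / num_heads.

NUM_PARAMS = 30e9       # "30 billion" taken at face value
BYTES_PER_PARAM = 2     # assumed fp16/bf16 storage
HIDDEN_SIZE = 7168
NUM_HEADS = 64


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def head_dim(hidden_size: int, num_heads: int) -> int:
    """Per-head dimension, assuming the hidden size divides evenly across heads."""
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    return hidden_size // num_heads


if __name__ == "__main__":
    print(weight_memory_gb(NUM_PARAMS, BYTES_PER_PARAM))  # 60.0
    print(head_dim(HIDDEN_SIZE, NUM_HEADS))               # 112
```

Under these assumptions the weights alone take about 60 GB, leaving roughly 20 GB of an 80 GB budget for activations and the attention cache. That is consistent with the stated minimum, but it is not a measured figure.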

  • Transformer architecture aimed at multilingual processing
  • Pretrained on diverse multilingual datasets
  • Available in both base and chat versions
  • Aligned using reinforcement learning from human feedback (RLHF)
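As an illustration of how the base model might be loaded and queried, here is a minimal sketch using the Hugging Face `transformers` library. It assumes the checkpoint is published under the repo id `wenge-research/yayi2-30b` (inferred from the maintainer name above), that the repository ships custom modeling code requiring `trust_remote_code=True`, and that `accelerate` is installed for `device_map="auto"`. The helper names `clamp_new_tokens`, `load_model`, and `generate` are hypothetical.

```python
MODEL_ID = "wenge-research/yayi2-30b"  # assumed Hugging Face repo id
MAX_CONTEXT = 4096                     # context length stated in this document


def clamp_new_tokens(prompt_len: int, requested: int, max_context: int = MAX_CONTEXT) -> int:
    """Limit generated tokens so prompt plus output fits in the context window."""
    budget = max_context - prompt_len
    if budget <= 0:
        raise ValueError("prompt already fills the context window")
    return min(requested, budget)


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model; needs transformers, accelerate, and large GPUs."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", trust_remote_code=True
    )
    return tokenizer, model


def generate(prompt: str, tokenizer, model, max_new_tokens: int = 256) -> str:
    """Generate a completion for the prompt, respecting the context limit."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_len = inputs["input_ids"].shape[1]
    output = model.generate(
        **inputs, max_new_tokens=clamp_new_tokens(prompt_len, max_new_tokens)
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

A call would look like `tokenizer, model = load_model()` followed by `generate("The winter in Beijing is", tokenizer, model)`. The chat version may expect a specific prompt template, which is not covered here.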

Core Capabilities

  • Knowledge benchmarks: 80.5% on MMLU
  • Mathematical reasoning: 71.2% on GSM8K
  • Code generation: 53.1% on HumanEval
  • Multilingual understanding and generation
  • Logical reasoning and problem solving

Frequently Asked Questions

Q: What makes this model unique?

YAYI2-30B stands out for its benchmark results, particularly in knowledge testing and mathematical reasoning. Its reported scores on MMLU (80.5%) and CMMLU (84.0%) are among the strongest for models of similar size.

Q: What are the recommended use cases?

The model suits multilingual text generation, mathematical problem-solving, code generation, and complex reasoning tasks. It is particularly effective where a task demands broad knowledge and logical reasoning.
