yayi2-30b

wenge-research

YAYI2-30B is a powerful 30B-parameter multilingual LLM trained on 2.65T tokens, with strong performance across knowledge, math, and code tasks, including an MMLU score of 80.5%.

Property          Value
Parameter Count   30 Billion
Architecture      Transformer (64 layers, 64 heads)
Context Length    4096 tokens
Training Data     2.65T tokens (multilingual)
License           Apache-2.0 (code) / Custom (model)
Paper             arXiv:2312.14862

What is YAYI2-30B?

YAYI2-30B is a state-of-the-art large language model developed by Wenge Technology, representing a significant advancement in multilingual AI capabilities. The model is built on a sophisticated Transformer architecture and has been pretrained on an extensive dataset of 2.65 trillion tokens across multiple languages.

Implementation Details

The model uses 64 transformer layers, 64 attention heads, and a hidden size of 7168. It has a vocabulary of 81,920 tokens and supports a context length of 4096 tokens. Inference requires significant computational resources: a minimum of 80GB of GPU memory.
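The 80GB figure can be sanity-checked with back-of-the-envelope arithmetic. This is a sketch: the 2-bytes-per-parameter assumption corresponds to holding the weights in fp16/bf16.

```python
# Rough GPU memory estimate for YAYI2-30B inference.
# Assumption: weights stored in fp16/bf16, i.e. 2 bytes per parameter.
params = 30e9        # 30 billion parameters
bytes_per_param = 2  # fp16/bf16

weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: {weights_gb:.0f} GB")  # 60 GB
```

Weights alone take roughly 60 GB; the KV cache, activations, and framework overhead account for the gap up to the 80GB minimum.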

  • Advanced transformer architecture optimized for multilingual processing
  • Comprehensive training across diverse datasets
  • Supports both base and chat versions
  • Aligned with reinforcement learning from human feedback (RLHF)

Core Capabilities

  • Strong performance on knowledge benchmarks (80.5% on MMLU)
  • Excellence in mathematical reasoning (71.2% on GSM8K)
  • Superior code generation capabilities (53.1% on HumanEval)
  • Robust multilingual understanding and generation
  • Advanced logical reasoning and problem-solving abilities

Frequently Asked Questions

Q: What makes this model unique?

YAYI2-30B stands out for its exceptional performance across multiple benchmarks, particularly in knowledge testing and mathematical reasoning. It achieves state-of-the-art results among similar-sized models, especially in MMLU (80.5%) and CMMLU (84.0%).

Q: What are the recommended use cases?

The model is well-suited for a wide range of applications including multilingual text generation, mathematical problem-solving, code generation, and complex reasoning tasks. It's particularly effective for scenarios requiring deep knowledge understanding and logical reasoning.
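As an illustrative sketch of how these use cases might be served, the following assumes the standard Hugging Face transformers API and a `wenge-research/yayi2-30b` checkpoint name; verify both against the official repository before use.

```python
def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Sketch: text generation with YAYI2-30B via Hugging Face transformers.

    Assumptions: the wenge-research/yayi2-30b checkpoint name, a GPU setup
    with >= 80GB of memory, and that the repo ships custom model code
    (hence trust_remote_code=True).
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "wenge-research/yayi2-30b"  # assumed checkpoint id
    tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        repo,
        torch_dtype=torch.bfloat16,  # ~60GB of weights in bf16
        device_map="auto",           # shard across available GPUs
        trust_remote_code=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Because this is the base (non-chat) model, prompts should be phrased as text to continue rather than as conversational turns.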
