Mixtral_34Bx2_MoE_60B

cloudyu

A powerful 60.8B parameter MoE model combining two 34B models, achieving strong performance on various benchmarks with multilingual capabilities and an efficient architecture.

| Property | Value |
| --- | --- |
| Parameter Count | 60.8B |
| License | Apache 2.0 |
| Tensor Type | BF16 |
| Architecture | Mixture of Experts (MoE) |

What is Mixtral_34Bx2_MoE_60B?

Mixtral_34Bx2_MoE_60B is a Mixture of Experts (MoE) model that combines two 34B models into a single 60.8B parameter language model. It achieved strong scores on the Open LLM Leaderboard, including 45.38% accuracy on IFEval and 41.21% on BBH. The model supports both English and Chinese, making it versatile for multilingual applications.

Implementation Details

The model is built upon two base models: jondurbin/bagel-dpo-34b-v0.2 and SUSTech/SUS-Chat-34B. It utilizes BF16 precision and can be deployed on both GPU and CPU environments with appropriate configurations. The model supports various inference settings and includes built-in safety measures.

  • Supports both GPU and CPU deployment
  • Implements repetition penalty for better output quality
  • Configurable max token generation
  • Built-in tokenizer with customizable system prompts
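The deployment options above can be sketched with Hugging Face transformers. This is a minimal example, not the author's exact script: the model ID follows the card's naming, while the prompt and sampling values are illustrative assumptions.

```python
# Sketch of loading and running the model with transformers.
# Assumptions: model ID "cloudyu/Mixtral_34Bx2_MoE_60B", illustrative
# prompt and sampling values; the card itself specifies BF16 precision,
# a repetition penalty, and a configurable token budget.

def generation_settings(max_new_tokens: int = 256) -> dict:
    """Generation kwargs mirroring the bullets above."""
    return {
        "max_new_tokens": max_new_tokens,  # configurable max token generation
        "repetition_penalty": 1.1,         # repetition penalty for output quality
        "do_sample": True,
        "temperature": 0.7,
    }

def main() -> None:
    # Heavy imports are kept inside main() so the settings helper above
    # can be used without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "cloudyu/Mixtral_34Bx2_MoE_60B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # BF16, per the card
        device_map="auto",           # GPU if available, otherwise CPU
    )

    prompt = "Explain the advantages of a Mixture of Experts architecture."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, **generation_settings())
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```

With `device_map="auto"`, transformers places layers across available GPUs and falls back to CPU, which matches the card's claim that the model runs in both environments (CPU inference of a 60.8B model is very slow, however).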

Core Capabilities

  • Strong performance on multiple benchmarks (76.66 average score on the earlier Open LLM Leaderboard)
  • Excellent results on HellaSwag (85.25%) and Winogrande (84.85%)
  • Capable of handling complex reasoning tasks
  • Bilingual support for English and Chinese
  • Efficient text generation with customizable parameters

Frequently Asked Questions

Q: What makes this model unique?

This model's unique strength lies in its Mixture of Experts architecture, combining two powerful 34B models to create a more capable system. It achieves strong performance across various benchmarks while maintaining multilingual capabilities.

Q: What are the recommended use cases?

The model excels in text generation, reasoning tasks, and multilingual applications. It's particularly well-suited for complex problem-solving, creative writing, and applications requiring both English and Chinese language processing.
