Zephyr ORPO 141B
| Property | Value |
|---|---|
| Parameter Count | 141B (39B active) |
| Base Model | Mixtral-8x22B-v0.1 |
| License | Apache 2.0 |
| Training Time | 1.3 hours on 4 nodes of 8 x H100s |
| Paper | [ORPO Paper](https://arxiv.org/abs/2403.07691) |
What is zephyr-orpo-141b-A35b-v0.1?
Zephyr ORPO 141B is an open chat model built on the Mixtral-8x22B architecture, a sparse Mixture of Experts (MoE) design with 141B total parameters, of which 39B are active during inference. It was fine-tuned with Odds Ratio Preference Optimization (ORPO) on a curated dataset of roughly 7,000 preference pairs.
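For readers who want to try the recipe at smaller scale, below is a minimal fine-tuning sketch using trl's ORPOTrainer; it is not the released training code. The hyperparameter values are illustrative, argument names such as `processing_class` vary across trl versions, and depending on your version the dataset may need an extra mapping step.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

# The real run fine-tuned Mixtral-8x22B-v0.1 across 4 nodes of 8 x H100s;
# substituting a smaller base model is advisable for experimentation.
model_id = "mistralai/Mixtral-8x22B-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The ~7k preference pairs used for Zephyr ORPO 141B. You may need to map
# this into the prompt/chosen/rejected format ORPOTrainer expects.
dataset = load_dataset("argilla/distilabel-capybara-dpo-7k-binarized", split="train")

# beta is trl's name for the lambda weighting of the odds-ratio term;
# 0.1 here is illustrative, not necessarily the released model's setting.
args = ORPOConfig(output_dir="zephyr-orpo-sketch", beta=0.1, bf16=True)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` in older trl releases
)
trainer.train()
```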
Implementation Details
The model was trained in BF16 precision with distributed training across 4 nodes of 8 H100 GPUs, finishing in about 1.3 hours. It scores 8.17 on MT-Bench and 65.06 on IFEval, strong results across conversational and instruction-following benchmarks.
- Trained with ORPO, which requires no separate SFT stage
- Fine-tuned on the argilla/distilabel-capybara-dpo-7k-binarized dataset
- Inherits Mixtral's sparse MoE architecture, activating only a subset of parameters per token
- Supports multi-turn chat with standard sampling controls such as temperature, as in the inference sketch below
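The following is a minimal inference sketch with the transformers text-generation pipeline; the prompt contents and sampling values are illustrative, and running the full model in BF16 requires multi-GPU hardware.

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1",
    device_map="auto",          # shard the 141B parameters across available GPUs
    torch_dtype=torch.bfloat16,
)

messages = [
    {"role": "system", "content": "You are Zephyr, a helpful assistant."},
    {"role": "user", "content": "Explain mixture-of-experts models in two sentences."},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,  # the sampling control mentioned above
    top_k=50,
    top_p=0.95,
)
# The pipeline returns the full conversation; the last message is the reply.
print(outputs[0]["generated_text"][-1]["content"])
```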
Core Capabilities
- High-quality conversational AI responses
- Strong performance on reasoning and evaluation benchmarks
- Efficient parameter utilization through MoE architecture
- Chat template support for multi-turn prompting (see the sketch below)
- Controllable text generation via standard sampling parameters
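To inspect the chat template without loading the full model, the tokenizer alone is enough; the message contents below are illustrative.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is ORPO?"},
]

# Render the conversation into the prompt string the model expects,
# with the generation prompt appended so the model answers as the assistant.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```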
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its ORPO training recipe, which folds preference optimization into a single training stage rather than requiring an SFT step followed by DPO or PPO, making it cheaper to train. In addition, its MoE architecture, inherited from Mixtral-8x22B, keeps per-token compute well below what the full 141B parameter count suggests.
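To make the single-stage idea concrete, here is a minimal PyTorch sketch of the ORPO objective as described in the paper, not the code used to train this model; the function name and the lambda default are illustrative.

```python
import torch
import torch.nn.functional as F

def orpo_loss(chosen_logps, rejected_logps, lam=0.1):
    """Sketch of the ORPO objective for one batch.

    chosen_logps / rejected_logps: length-normalized (mean per-token)
    log-probabilities of the chosen and rejected responses under the
    policy model, shape (batch,).
    """
    # log-odds of each sequence: log(p / (1 - p)), computed from log p
    def log_odds(logps):
        return logps - torch.log1p(-torch.exp(logps))

    # odds-ratio term: push the chosen response's odds above the rejected one's
    ratio = log_odds(chosen_logps) - log_odds(rejected_logps)
    l_or = -F.logsigmoid(ratio)

    # NLL term: standard SFT loss on the chosen response
    l_sft = -chosen_logps

    # Single combined objective: no separate SFT stage needed
    return (l_sft + lam * l_or).mean()
```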
Q: What are the recommended use cases?
The model excels at general chat, code generation, mathematical reasoning, and multi-step problem solving. It is well suited to applications that need sophisticated language understanding and generation, but it has not undergone the safety alignment applied to many commercial models, so deployments should add their own moderation.