zephyr-orpo-141b-A35b-v0.1

Maintained By
HuggingFaceH4

Zephyr ORPO 141B

Parameter Count: 141B (39B active)
Base Model: Mixtral-8x22B-v0.1
License: Apache 2.0
Training Time: 1.3 hours on 4 nodes of 8 x H100s
Paper: ORPO Paper

What is zephyr-orpo-141b-A35b-v0.1?

Zephyr ORPO 141B is an open assistant model built on the Mixtral-8x22B architecture. It uses a Mixture of Experts (MoE) design with 141B total parameters, of which 39B are active per token during inference. The model was fine-tuned using Odds Ratio Preference Optimization (ORPO) on a curated dataset of roughly 7,000 preference pairs.
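The gap between total and active parameters comes from sparse expert routing: each token is sent to only a few experts, so most weights sit idle on any given forward pass. A minimal sketch of top-k routing (illustrative only; the real Mixtral router, expert count, and shapes differ):

```python
import numpy as np

# Toy Mixture-of-Experts layer: 8 experts, top-2 routing. Only the two
# selected experts run per token, so the "active" parameter count is a
# fraction of the total. All sizes here are arbitrary toy values.
rng = np.random.default_rng(0)
n_experts, d_model, top_k = 8, 16, 2

router_w = rng.normal(size=(d_model, n_experts))
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x):
    logits = x @ router_w                 # routing score per expert
    top = np.argsort(logits)[-top_k:]     # indices of the top-2 experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                  # softmax over selected experts only
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

x = rng.normal(size=d_model)
y = moe_forward(x)
```

Scaling the same idea up, a 141B-parameter MoE pays roughly the inference cost of a dense model the size of its active-parameter count.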

Implementation Details

The model was trained in BF16 precision, distributed across multiple nodes of H100 GPUs. It scores 8.17 on MT-Bench and 65.06 on IFEval, demonstrating solid capabilities across a range of tasks.

  • Trained using ORPO, which does not require a separate SFT step
  • Utilizes the argilla/distilabel-capybara-dpo-7k-binarized dataset
  • Sparse MoE architecture: only a subset of experts runs per token
  • Ships with a chat template; generation parameters such as temperature are user-controllable
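In practice, chat prompts should be rendered with the tokenizer's own `apply_chat_template`; the hand-rolled Zephyr-style template below is an assumption, shown only to illustrate the message structure the model expects:

```python
# Sketch of building a chat prompt in the Zephyr template style.
# The exact special tokens for zephyr-orpo-141b-A35b-v0.1 should be
# taken from tokenizer.apply_chat_template; this version is illustrative.

def build_zephyr_prompt(messages):
    """Render a list of {role, content} dicts into a single prompt string."""
    parts = [f"<|{m['role']}|>\n{m['content']}</s>" for m in messages]
    parts.append("<|assistant|>\n")  # cue the model to generate a reply
    return "\n".join(parts)

messages = [
    {"role": "system", "content": "You are Zephyr, a helpful assistant."},
    {"role": "user", "content": "What is ORPO?"},
]
prompt = build_zephyr_prompt(messages)
```

When sampling, the usual knobs apply: `do_sample=True` with a moderate temperature (e.g. around 0.7) and top-p/top-k filtering are common starting points, though the best values depend on the task.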

Core Capabilities

  • High-quality conversational AI responses
  • Strong performance on reasoning and evaluation benchmarks
  • Efficient parameter utilization through MoE architecture
  • Comprehensive chat template support
  • Advanced text generation with controllable parameters

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its ORPO training methodology, which achieves strong preference alignment in a single stage, without the separate SFT step that multi-stage pipelines like DPO and PPO typically require, making it more computationally efficient. Additionally, its MoE architecture delivers large-model quality at a fraction of the per-token compute.
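ORPO folds preference learning into the supervised objective: the standard NLL loss on the chosen response is augmented with an odds-ratio penalty that pushes the chosen response's odds above the rejected one's. A minimal numeric sketch of that term (following the ORPO paper; the probability and lambda values here are arbitrary):

```python
import math

# Sketch of the ORPO objective: SFT loss plus a weighted odds-ratio term.
# p_chosen / p_rejected stand in for the model's (average token)
# probabilities of the chosen and rejected responses; lam is the
# paper's lambda weight (the value used here is arbitrary).

def odds(p):
    return p / (1.0 - p)

def orpo_loss(nll_chosen, p_chosen, p_rejected, lam=0.1):
    log_odds_ratio = math.log(odds(p_chosen)) - math.log(odds(p_rejected))
    # -log sigmoid of the log odds ratio: small when chosen >> rejected
    l_or = -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))
    return nll_chosen + lam * l_or

loss = orpo_loss(nll_chosen=1.2, p_chosen=0.6, p_rejected=0.3)
```

Because the penalty vanishes only as the chosen response's odds dominate, a single training stage both imitates the chosen completions and discourages the rejected ones.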

Q: What are the recommended use cases?

The model excels at general chat, code generation, mathematical reasoning, and complex problem solving. It is well-suited to applications requiring sophisticated language understanding and generation, though users should note it has not undergone the safety alignment applied to commercial models.
