OLMoE-1B-7B-0125


OLMoE-1B-7B-0125 is an efficient Mixture-of-Experts (MoE) LLM with 1.3B active and 7B total parameters, achieving state-of-the-art results in its class and performance comparable to Llama2-13B.

| Property | Value |
|---|---|
| Active Parameters | 1.3B |
| Total Parameters | 7B |
| Paper | arxiv.org/abs/2409.02060 |
| License | Open Source |
| Author | Allen AI |

What is OLMoE-1B-7B-0125?

OLMoE-1B-7B-0125 is a state-of-the-art Mixture-of-Experts (MoE) language model that achieves remarkable efficiency by activating only 1.3B parameters per token while drawing on 7B total parameters. This architecture allows it to compete with much larger models like Llama2-13B at a fraction of the computational footprint.
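The active-vs-total distinction can be made concrete with a bit of arithmetic. The sketch below is illustrative only: the expert counts (64 experts per MoE layer, 8 routed per token) are assumptions based on the OLMoE paper, and the per-expert and shared parameter counts are placeholder values chosen to roughly reproduce the 1.3B/7B split.

```python
# Back-of-the-envelope: active vs. total parameters in an MoE model.
# Expert counts (64 total, 8 routed per token) are assumptions from the
# OLMoE paper; the parameter sizes below are illustrative placeholders.

NUM_EXPERTS = 64          # experts per MoE layer (assumed)
ACTIVE_EXPERTS = 8        # experts routed per token (assumed)
EXPERT_PARAMS = 102e6     # illustrative parameters per expert, all layers
SHARED_PARAMS = 0.49e9    # attention, embeddings, routers (illustrative)

total_params = SHARED_PARAMS + NUM_EXPERTS * EXPERT_PARAMS
active_params = SHARED_PARAMS + ACTIVE_EXPERTS * EXPERT_PARAMS

# Every token pays for the shared parameters plus only its routed
# experts, so per-token compute scales with active_params, not total.
print(f"total:  {total_params / 1e9:.1f}B")
print(f"active: {active_params / 1e9:.1f}B")
```

The point of the sketch: per-token inference cost tracks the ~1.3B active parameters, while model capacity tracks the 7B total.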

Implementation Details

The model is implemented using the Transformers library and can be easily deployed using PyTorch. It features a sophisticated MoE architecture that dynamically routes computations through different expert networks, optimizing both performance and efficiency.

  • Supports both FP32 and BF16 weight formats
  • Multiple checkpoints available for different use cases
  • Pretrained on over 5 trillion (5,033B) tokens
  • Includes specialized SFT and instruction-tuned variants
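The dynamic routing described above can be sketched in a few lines. This is a minimal top-k gating illustration in plain Python, not OLMoE's actual implementation; the softmax-then-top-k-renormalize recipe is the standard MoE gating scheme, and the 8-expert/top-2 toy sizes are chosen only to keep the example small.

```python
import math

def top_k_route(gate_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights.

    Standard MoE gating sketch: softmax over the router's expert logits,
    keep the top-k experts, renormalize so the kept weights sum to 1.
    """
    m = max(gate_logits)
    exps = [math.exp(g - m) for g in gate_logits]  # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

# Toy example: 8 experts, route each token to its top 2.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]
routes = top_k_route(logits, k=2)
for expert, weight in routes:
    print(f"expert {expert}: weight {weight:.2f}")
```

In an MoE layer, only the selected experts' feed-forward networks then run, and their outputs are summed with these weights, which is what keeps per-token compute proportional to the active parameter count.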

Core Capabilities

  • Strong performance on MMLU (56.3%)
  • Excellent results on HellaSwag (81.7%)
  • High accuracy on ARC-Challenge (67.5%)
  • Leading performance among models with ~1B active parameters

Frequently Asked Questions

Q: What makes this model unique?

OLMoE-1B-7B-0125 stands out for its efficient use of the Mixture-of-Experts architecture, achieving performance comparable to much larger models while using only 1.3B active parameters. It's fully open-source and achieves state-of-the-art results in its parameter class.

Q: What are the recommended use cases?

The model is well-suited for general language understanding tasks, particularly excelling in multiple-choice reasoning, common sense understanding, and scientific knowledge. It's ideal for applications requiring high performance with limited computational resources.
