OLMo-2-1124-13B

Maintained By
allenai

Property          Value
Parameter Count   13.7B
Training Tokens   5 trillion
Context Length    4096 tokens
License           Apache 2.0
Language          English

What is OLMo-2-1124-13B?

OLMo-2-1124-13B is a language model developed by the Allen Institute for AI (Ai2) as part of its Open Language Model (OLMo) initiative. The model has 13.7B parameters, was trained on 5 trillion tokens, and is designed to compete with leading models in the field.

Implementation Details

The model has 40 layers, a hidden size of 5120, and 40 attention heads. Training ran in two stages: initial pretraining on OLMo-Mix-1124 (1.2 epochs), followed by fine-tuning on the Dolmino-Mix-1124 dataset. The final model is a merge of multiple training runs, including three 100B-token versions and one 300B-token version.
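The merging step can be illustrated with a small sketch. The text does not say how the runs were combined, so the uniform element-wise average below is an assumption, and the helper `average_checkpoints` and the toy weights are hypothetical stand-ins for real checkpoints.

```python
# Sketch of merging checkpoints by averaging their weights element-wise.
# Assumption: "merging" means a uniform average; the text does not specify.

def average_checkpoints(checkpoints):
    """Average a list of {parameter_name: list_of_floats} checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    names = checkpoints[0].keys()
    for ckpt in checkpoints[1:]:
        if ckpt.keys() != names:
            raise ValueError("checkpoints must share parameter names")
    merged = {}
    for name in names:
        # Walk the same position of this parameter across all checkpoints.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        merged[name] = [sum(vals) / len(checkpoints) for vals in columns]
    return merged


# Toy stand-ins for three 100B-token runs and one 300B-token run.
runs = [
    {"w": [1.0, 2.0]},
    {"w": [3.0, 4.0]},
    {"w": [5.0, 6.0]},
    {"w": [7.0, 8.0]},
]
merged = average_checkpoints(runs)
print(merged)
```

A uniform average treats every run equally; weighting the 300B-token run more heavily would be an equally plausible choice that the text neither confirms nor rules out.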

  • 40 transformer layers with a hidden size of 5120 and 40 attention heads
  • Two-stage training on the OLMo-Mix-1124 and Dolmino-Mix-1124 datasets
  • Can be loaded with quantization to reduce memory use
  • 4096 token context window
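As a sanity check, the architecture figures above can be related to the 13.7B parameter count with a back-of-the-envelope estimate. The MLP width (13824), vocabulary size (100352), gated three-matrix MLP, full multi-head attention, and untied embeddings are assumptions not stated in this document; normalization weights are ignored as negligible.

```python
# Rough parameter-count estimate for a decoder-only transformer.
# From the text: 40 layers, hidden size 5120, 40 attention heads.
# Assumed (not stated in the text): gated MLP of width 13824, vocabulary of
# 100352 entries, untied input/output embeddings, no bias terms.

N_LAYERS = 40
HIDDEN = 5120
N_HEADS = 40
MLP_WIDTH = 13824     # assumption
VOCAB_SIZE = 100352   # assumption


def estimate_parameters(n_layers, hidden, mlp_width, vocab_size):
    attention = 4 * hidden * hidden        # Q, K, V and output projections
    mlp = 3 * hidden * mlp_width           # gate, up and down projections
    embeddings = 2 * vocab_size * hidden   # input embedding + output head
    return n_layers * (attention + mlp) + embeddings


head_dim = HIDDEN // N_HEADS
total = estimate_parameters(N_LAYERS, HIDDEN, MLP_WIDTH, VOCAB_SIZE)
print(f"head dimension: {head_dim}")
print(f"estimated parameters: {total / 1e9:.2f}B")
```

Under these assumptions the estimate lands close to the 13.7B listed in the property table, with most of the parameters sitting in the per-layer MLP and attention matrices.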

Core Capabilities

  • Competitive performance with leading models on English academic benchmarks
  • Strong results in tasks like ARC, MMLU, and TriviaQA
  • Achieves 68.3% average score across major benchmarks
  • Evaluated on mathematical reasoning and natural language understanding tasks

Frequently Asked Questions

Q: What makes this model unique?

OLMo-2-1124-13B combines a fully open release with performance that is competitive with leading models. It is trained on a curated dataset mix (OLMo-Mix-1124 and Dolmino-Mix-1124), and its training process and model architecture are documented openly.

Q: What are the recommended use cases?

The model is best suited to academic and research applications, particularly tasks that require language understanding, mathematical reasoning, and multi-step problem-solving. It can also be used in practical natural language processing applications.
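For experimentation, a minimal text-generation sketch with the Hugging Face `transformers` library might look like the following. It assumes the weights are published under the repository id `allenai/OLMo-2-1124-13B`, that a recent `transformers` release with OLMo 2 support and PyTorch are installed, and that enough memory is available for a 13.7B-parameter model. The helper name `generate` is hypothetical.

```python
MODEL_ID = "allenai/OLMo-2-1124-13B"  # assumed Hub repository id
CONTEXT_LENGTH = 4096                 # from the property table


def generate(prompt, max_new_tokens=64):
    """Generate a continuation of `prompt` (downloads the weights on first use)."""
    # Imported lazily so this file can be read without the dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Truncate the prompt so prompt plus output fits in the context window.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        return_token_type_ids=False,
        truncation=True,
        max_length=CONTEXT_LENGTH - max_new_tokens,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (not run here because it downloads the full model):
# print(generate("Language modeling is "))
```

Because this is a base model rather than an instruction-tuned one, prompts phrased as text to be continued tend to work better than chat-style instructions.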
