OLMo-2-1124-13B-Instruct

allenai

OLMo-2 13B instruct model by AllenAI - Open language model with 13.7B params, trained on Tülu 3 dataset for chat and math tasks. Apache 2.0 licensed.

Property	Value
Parameter Count	13.7B
License	Apache 2.0
Paper	Link
Base Model	OLMo-2-13B-1124-DPO

What is OLMo-2-1124-13B-Instruct?

OLMo-2-1124-13B-Instruct is an advanced language model developed by Allen Institute for AI as part of their OLMo (Open Language Model) series. This model represents a significant advancement in open-source AI, featuring 13.7B parameters and specialized training for instruction-following capabilities. It's built upon the base OLMo-2 13B model and has undergone multiple training phases including supervised fine-tuning on the Tülu 3 dataset, DPO training, and RLVR training.

Implementation Details

The model implements a sophisticated architecture trained specifically for instruction-following tasks. It uses BF16 tensor type and includes a specific chat template format for consistent interaction. The training process involved multiple stages, culminating in RLVR training with carefully tuned hyperparameters including a 4×10⁻⁷ learning rate and 2,048 token length capacity.

Specialized training on math, reasoning, and general instruction-following tasks
Implements advanced PPO settings for RLVR training
Supports both chat and task-specific applications
Includes built-in chat templating system

Core Capabilities

Strong performance on mathematical reasoning (87.4% on GSM8k)
Advanced instruction following and chat capabilities
High safety scores (77.5% on safety benchmarks)
Robust performance across multiple evaluation metrics including MMLU (68.6%)

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its fully open nature and strong performance across diverse tasks, particularly in mathematics and reasoning. It's also notable for its transparent training process and comprehensive documentation.

Q: What are the recommended use cases?

The model excels in mathematical problem-solving, general instruction following, and chat applications. It's particularly well-suited for research and educational purposes, with specific strength in tasks requiring logical reasoning and mathematical computation.