OLMo-2-1124-7B-DPO

OLMo-2-1124-7B-DPO

allenai

OLMo-2 7B DPO model optimized for chat & tasks like MATH/GSM8K. Apache 2.0 licensed, trained on Tülu 3 dataset with length-normalized DPO approach.

PropertyValue
Base ModelOLMo-2-7B-SFT
LicenseApache 2.0
PaperTülu 3 Paper
Primary LanguageEnglish

What is OLMo-2-1124-7B-DPO?

OLMo-2-1124-7B-DPO is an advanced language model developed by Allen Institute for AI (AI2) as part of their OLMo (Open Language Model) series. This model represents a significant advancement in open-source AI, having undergone Direct Preference Optimization (DPO) training after initial supervised fine-tuning on the Tülu 3 dataset.

Implementation Details

The model utilizes a length-normalized DPO training approach with specific hyperparameters including a learning rate of 8E-7, beta value of 5, and effective batch size of 128. It supports a maximum sequence length of 2048 tokens and employs a linear learning rate schedule with 0.1 warmup ratio.

  • Trained on a custom preference dataset mix
  • Implements chat template with user/assistant format
  • Supports standard HuggingFace transformers pipeline
  • Optimized for both conversational and analytical tasks

Core Capabilities

  • Strong performance on mathematical reasoning (GSM8k: 82.4%)
  • High safety scores (81.5% on safety benchmarks)
  • Effective on general knowledge tasks (MMLU: 63.4%)
  • Improved chat capabilities through DPO training

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its fully open nature and strong performance across diverse tasks, particularly excelling in mathematical reasoning and safety aspects while maintaining competitive scores in general knowledge tasks.

Q: What are the recommended use cases?

This model is particularly well-suited for research and educational applications, especially in scenarios requiring mathematical reasoning, structured problem-solving, and safe conversational interactions. It's designed to handle both chat-based interactions and specific task-oriented applications.

Related Models

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026