OLMo-2-0325-32B

Maintained By
allenai


Property          Value
Parameter Count   32 Billion
Training Tokens   6 Trillion
License           Apache 2.0
Research Paper    arXiv:2501.00656
Context Length    4096 tokens

What is OLMo-2-0325-32B?

OLMo-2-0325-32B is the largest model in the OLMo 2 family, developed by the Allen Institute for AI. It is a Transformer-style autoregressive language model pretrained on 6 trillion tokens from the OLMo-mix-1124 dataset. The model has 64 layers, a hidden size of 5120, and 40 attention heads, placing it among the most capable fully open models.

Implementation Details

The model was trained in two stages. Stage 1 pretrained on 6 trillion tokens (approximately 1.5 epochs of OLMo-mix-1124); Stage 2 continued training on the Dolmino-Mix-1124 dataset. The final model is a merge of multiple Stage 2 runs: three trained on 100B tokens and one on 300B tokens.
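Merging runs of this kind is typically done by averaging the weights of the candidate checkpoints ("model souping"). The sketch below is a minimal uniform average over toy state dicts; the actual merge recipe and any per-run weighting used for OLMo-2-0325-32B are not specified here, and all names and values are illustrative.

```python
# Minimal sketch of checkpoint merging ("model souping").
# State dicts are represented as {parameter_name: list of floats};
# a real implementation would operate on tensors.

def merge_checkpoints(checkpoints):
    """Element-wise average a list of state dicts with identical shapes."""
    n = len(checkpoints)
    merged = {}
    for name in checkpoints[0]:
        size = len(checkpoints[0][name])
        merged[name] = [
            sum(ckpt[name][i] for ckpt in checkpoints) / n
            for i in range(size)
        ]
    return merged

# Three toy "runs", each with a single two-element weight tensor.
runs = [
    {"layer0.weight": [1.0, 2.0]},
    {"layer0.weight": [3.0, 4.0]},
    {"layer0.weight": [5.0, 6.0]},
]
souped = merge_checkpoints(runs)
print(souped["layer0.weight"])  # [3.0, 4.0]
```

A uniform average like this assumes the runs share the same architecture and were fine-tuned from a common ancestor, which is what makes their weights compatible enough to average.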

  • Architecture: 64 layers, 5120 hidden size, 40 attention heads
  • Context window: 4096 tokens
  • Training FLOPs: 1.3 × 10^24
  • Average benchmark score: 72.9 across standard evaluations
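The numbers above can be cross-checked with standard back-of-the-envelope rules (these rules are not from the model card): the common 6·N·D estimate for dense-transformer training compute, and head dimension = hidden size / number of heads.

```python
# Sanity checks on the reported architecture and compute figures.
N = 32e9   # parameters
D = 6e12   # training tokens

# 6 * N * D is the usual rough estimate of training FLOPs
# for a dense transformer; it lands near the reported 1.3e24.
flops_estimate = 6 * N * D
print(f"{flops_estimate:.2e}")  # 1.15e+24

# 5120 hidden size split across 40 heads gives 128 dims per head.
head_dim = 5120 // 40
print(head_dim)  # 128
```

The 6·N·D estimate ignores embedding lookups and other overheads, so rough agreement with the reported 1.3 × 10^24 is the most one should expect.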

Core Capabilities

  • Strong performance on complex reasoning tasks (90.4% on ARC-Challenge)
  • Robust natural language understanding (89.7% on WinoGrande)
  • Advanced mathematical reasoning (78.8% on GSM8k)
  • Competitive question answering (88.0% on TriviaQA)

Frequently Asked Questions

Q: What makes this model unique?

OLMo-2-0325-32B stands out for its fully open-source nature, comprehensive documentation, and competitive performance against partially open and closed models. It achieves state-of-the-art results among fully open models across multiple benchmarks.

Q: What are the recommended use cases?

The model excels in various tasks including complex reasoning, mathematical problem-solving, and question-answering. It's particularly suitable for research applications and can be fine-tuned for specific tasks using the provided training scripts and documentation.
