OLMo-7B-0724-hf

Maintained by: allenai

  • Parameter Count: 7 billion
  • Training Tokens: 2.75 trillion
  • License: Apache 2.0
  • Context Length: 4096 tokens
  • Developer: Allen Institute for AI (AI2)

What is OLMo-7B-0724-hf?

OLMo-7B-0724-hf is an open-source large language model developed by the Allen Institute for AI as part of its initiative to advance the science of language models. Built on a 32-layer transformer with a hidden size of 4096 and 32 attention heads, it represents a significant advance for open-source AI research.

Implementation Details

The model employs a two-stage training approach: initial training on the Dolma 1.7 dataset with a cosine learning-rate schedule, followed by a second stage on a high-quality subset of the data. It uses non-parametric LayerNorm, rotary positional embeddings (RoPE), and full attention without bias terms. Key specifications are summarized below; a minimal loading sketch follows the list.

  • Architecture: 32 layers, 4096 hidden size, 32 attention heads
  • Training: 2.75 trillion tokens on the Dolma dataset
  • Optimizer: AdamW with a peak learning rate of 3.0E-4
  • Context window: 4096 tokens
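
Since the "-hf" suffix denotes the transformers-native checkpoint format, the model loads directly with Hugging Face transformers. A minimal sketch, assuming transformers >= 4.40 (the first release with built-in OLMo support) and the accelerate package for device placement:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B-0724-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~14 GB in bf16, fits on a single 24 GB GPU
    device_map="auto",           # requires the accelerate package
)

prompt = "Language modeling is "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```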

Core Capabilities

  • Strong performance on GSM8k (35% accuracy; a few-shot prompting sketch follows this list)
  • Competitive MMLU performance (53.4%)
  • Excellent scientific QA capabilities (97% on SciQ)
  • Robust general reasoning and comprehension
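
Because this is a base model rather than an instruction-tuned one, benchmark-style tasks such as GSM8k are usually elicited with few-shot prompting. A sketch reusing the model and tokenizer loaded above; the exemplar problems are illustrative stand-ins, not actual benchmark items:

```python
# Few-shot prompt for GSM8k-style word problems; real GSM8k evaluation
# typically uses 8-shot chain-of-thought exemplars.
few_shot = (
    "Question: Sara has 3 apples and buys 4 more. How many apples does she have?\n"
    "Answer: 3 + 4 = 7. The answer is 7.\n\n"
    "Question: A train travels 60 miles per hour for 2 hours. How far does it go?\n"
    "Answer: 60 * 2 = 120. The answer is 120.\n\n"
    "Question: Tom reads 12 pages a day for 5 days. How many pages does he read?\n"
    "Answer:"
)
inputs = tokenizer(few_shot, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```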

Frequently Asked Questions

Q: What makes this model unique?

OLMo-7B stands out for its completely open approach to model development: all training code, data, checkpoints, and training logs are publicly available. Its two-stage training process and strong scientific-QA performance make it particularly valuable for research applications.

Q: What are the recommended use cases?

The model excels at scientific reasoning, mathematical problem-solving, and general language understanding. It is particularly suitable for research applications and as a foundation for further fine-tuning in specialized domains; a parameter-efficient fine-tuning sketch follows.
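
For the fine-tuning use case, a parameter-efficient method such as LoRA keeps hardware requirements modest. A hypothetical sketch using the peft library; the target module names assume the converted checkpoint exposes the usual q_proj/v_proj attention projections:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0724-hf")
lora_config = LoraConfig(
    r=16,                                 # low-rank adapter dimension
    lora_alpha=32,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed names)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices train
```

The adapted model can then be trained with a standard transformers Trainer loop on domain data, and the resulting adapter weights saved separately from the 7B base checkpoint.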
