OLMo-7B-0724-hf

Maintained by: allenai

  • Parameter Count: 7 billion
  • Training Tokens: 2.75 trillion
  • License: Apache 2.0
  • Context Length: 4096 tokens
  • Developer: Allen Institute for AI (AI2)

What is OLMo-7B-0724-hf?

OLMo-7B-0724-hf is an open-source large language model developed by the Allen Institute for AI as part of its initiative to advance the science of language models. Built on a 32-layer transformer with a hidden size of 4096 and 32 attention heads, it represents a significant advance for open-source AI research.

Implementation Details

The model employs a two-stage training approach: initial training on the Dolma 1.7 dataset with a cosine learning-rate schedule, followed by a second stage on a high-quality subset of the data. It uses non-parametric LayerNorm, rotary positional embeddings (RoPE), and full attention without bias terms. Key specifications are summarized below; a minimal loading sketch follows the list.

  • Architecture: 32 layers, 4096 hidden size, 32 attention heads
  • Training: 2.75 trillion tokens on the Dolma dataset
  • Optimizer: AdamW with a peak learning rate of 3.0E-4
  • Context window: 4096 tokens
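
Since the "-hf" suffix denotes the transformers-native checkpoint format, the model loads directly with Hugging Face transformers. A minimal sketch, assuming transformers >= 4.40 (the first release with built-in OLMo support) and the accelerate package for device placement:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B-0724-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~14 GB in bf16, fits on a single 24 GB GPU
    device_map="auto",           # requires the accelerate package
)

prompt = "Language modeling is "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```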

Core Capabilities

  • Strong performance on GSM8k (35% accuracy; a few-shot prompting sketch follows this list)
  • Competitive MMLU performance (53.4%)
  • Excellent scientific QA capabilities (97% on SciQ)
  • Robust general reasoning and comprehension
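
Because this is a base model rather than an instruction-tuned one, benchmark-style tasks such as GSM8k are usually elicited with few-shot prompting. A sketch reusing the model and tokenizer loaded above; the exemplar problems are illustrative stand-ins, not actual benchmark items:

```python
# Few-shot prompt for GSM8k-style word problems; real GSM8k evaluation
# typically uses 8-shot chain-of-thought exemplars.
few_shot = (
    "Question: Sara has 3 apples and buys 4 more. How many apples does she have?\n"
    "Answer: 3 + 4 = 7. The answer is 7.\n\n"
    "Question: A train travels 60 miles per hour for 2 hours. How far does it go?\n"
    "Answer: 60 * 2 = 120. The answer is 120.\n\n"
    "Question: Tom reads 12 pages a day for 5 days. How many pages does he read?\n"
    "Answer:"
)
inputs = tokenizer(few_shot, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```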

Frequently Asked Questions

Q: What makes this model unique?

OLMo-7B stands out for its completely open approach to model development: all training code, data, checkpoints, and training logs are publicly available. Its two-stage training process and strong scientific-QA performance make it particularly valuable for research applications.

Q: What are the recommended use cases?

The model excels at scientific reasoning, mathematical problem-solving, and general language understanding. It is particularly suitable for research applications and as a foundation for further fine-tuning in specialized domains; a parameter-efficient fine-tuning sketch follows.
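
For the fine-tuning use case, a parameter-efficient method such as LoRA keeps hardware requirements modest. A hypothetical sketch using the peft library; the target module names assume the converted checkpoint exposes the usual q_proj/v_proj attention projections:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0724-hf")
lora_config = LoraConfig(
    r=16,                                 # low-rank adapter dimension
    lora_alpha=32,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed names)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices train
```

The adapted model can then be trained with a standard transformers Trainer loop on domain data, and the resulting adapter weights saved separately from the 7B base checkpoint.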
