OLMo-1B-hf

allenai

An open-source 1.18B-parameter language model from Allen AI, trained on 3 trillion tokens. It offers strong performance for its size and is released under the Apache 2.0 license.

Property         Value
Parameter Count  1.18B
Training Tokens  3 trillion
Context Length   2048 tokens
License          Apache 2.0
Paper            arXiv:2402.00838

What is OLMo-1B-hf?

OLMo-1B-hf is part of the Open Language Model (OLMo) series developed by Allen AI to advance the science of language modeling. This Hugging Face-compatible version has 16 layers, a hidden size of 2048, and 16 attention heads, and was trained on the comprehensive Dolma dataset.
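As a quick sanity check, the architecture described above can be read straight from the model configuration. This is a minimal sketch that assumes the Hugging Face model ID allenai/OLMo-1B-hf and the standard config attribute names used by Transformers causal-LM configs:

```python
from transformers import AutoConfig

# Inspect the architecture of OLMo-1B-hf from its published config.
# Assumes the Hugging Face model ID "allenai/OLMo-1B-hf".
config = AutoConfig.from_pretrained("allenai/OLMo-1B-hf")

print(config.num_hidden_layers)        # expected: 16
print(config.hidden_size)              # expected: 2048
print(config.num_attention_heads)      # expected: 16
print(config.max_position_embeddings)  # expected: 2048
```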

Implementation Details

The model utilizes a modern Transformer architecture with several optimizations:

  • Non-parametric LayerNorm and RoPE positional embeddings
  • Full attention mechanism with sequential block type
  • SwiGLU activation function
  • Training optimized with AdamW (lr=4.0e-4, weight decay=0.1); these settings are sketched in code below
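The optimizer settings listed above can be reproduced with plain PyTorch. The following is a hedged sketch rather than the authors' training code; it assumes the allenai/OLMo-1B-hf model ID and omits the learning-rate schedule used in the actual training run:

```python
import torch
from transformers import AutoModelForCausalLM

# Sketch of the stated optimizer settings, not the original OLMo
# training code. The real run also used an LR schedule not shown here.
model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1B-hf")
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=4.0e-4,         # learning rate from the model card
    weight_decay=0.1,  # weight decay from the model card
)
```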

Core Capabilities

  • Strong performance on core NLP tasks (62.42% average across standard benchmarks)
  • Competitive with larger models on some tasks
  • Efficient text generation with support for various sampling methods (see the generation example after this list)
  • Easy integration with Hugging Face Transformers library
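The snippet below is a minimal generation sketch assuming the Hugging Face model ID allenai/OLMo-1B-hf and a transformers version recent enough to include OLMo support; the prompt and sampling parameters are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load tokenizer and model; assumes the "allenai/OLMo-1B-hf" model ID.
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1B-hf")
model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1B-hf")

inputs = tokenizer("Language modeling is", return_tensors="pt")

# Nucleus (top-p) sampling with temperature; values are illustrative.
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    top_p=0.95,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Setting do_sample=False switches to greedy decoding; beam search and other strategies are available through the same generate interface.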

Frequently Asked Questions

Q: What makes this model unique?

OLMo-1B-hf stands out for its complete transparency in training data, methodology, and evaluation metrics. It achieves impressive performance for its size class, particularly in tasks like COPA (79%) and PIQA (73.7%).

Q: What are the recommended use cases?

The model is well-suited for research, text generation, and as a foundation for fine-tuning on specific applications (a minimal fine-tuning sketch follows below). It is particularly effective for tasks requiring strong reasoning capabilities within its 2048-token context window.
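For the fine-tuning use case, a minimal causal-LM training loop might look like the following. This is a sketch on toy data, assuming the allenai/OLMo-1B-hf model ID; a real run would use a proper dataset, batching, padding, and a learning-rate schedule:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Toy fine-tuning sketch; assumes the "allenai/OLMo-1B-hf" model ID.
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1B-hf")
model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1B-hf")
model.train()

texts = ["Example document one.", "Example document two."]  # toy data
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

for text in texts:
    batch = tokenizer(text, return_tensors="pt")
    # For causal-LM fine-tuning, the labels are the input ids themselves;
    # the model shifts them internally when computing the loss.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```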
