open_llama_3b

Maintained By: openlm-research

OpenLLaMA 3B

Parameters: 3 Billion
License: Apache 2.0
Training Data: RedPajama-Data-1T
Framework: PyTorch/JAX

What is open_llama_3b?

OpenLLaMA 3B is an open-source reproduction of Meta AI's LLaMA language model, trained on 1 trillion tokens. It represents a significant achievement in democratizing access to large language models, offering comparable performance to the original LLaMA while being freely available under an Apache 2.0 license.

Implementation Details

The model is trained using the EasyLM framework on cloud TPU-v4s, implementing both standard data parallelism and fully sharded data parallelism (FSDP). The architecture follows the original LLaMA design, maintaining identical hyperparameters, context length, and training procedures.

  • Trained on 1 trillion tokens drawn from the RedPajama dataset (which contains roughly 1.2 trillion tokens)
  • Implements identical architecture and training parameters as original LLaMA
  • Available in both PyTorch and JAX formats (see the loading sketch after this list)
  • Achieves competitive performance across multiple benchmarks
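
For reference, here is a minimal loading sketch for the PyTorch weights using the Hugging Face Transformers library. It assumes the checkpoint published on the Hugging Face Hub as openlm-research/open_llama_3b, half-precision loading, and automatic device placement (which additionally requires the accelerate package).

```python
# Minimal loading sketch (PyTorch weights via Hugging Face Transformers).
# Assumes the openlm-research/open_llama_3b checkpoint on the Hugging Face Hub;
# device_map="auto" additionally requires the accelerate package.
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = "openlm-research/open_llama_3b"

# The slow LlamaTokenizer is used here; at release time the auto-converted
# fast tokenizer was reported to tokenize incorrectly for this model.
tokenizer = LlamaTokenizer.from_pretrained(model_path)

model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # spread weights across available devices
)
```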

Core Capabilities

  • Strong performance in reasoning tasks (ANLI, ARC)
  • Effective at common sense reasoning (PIQA, HellaSwag)
  • Competitive accuracy in reading comprehension (ReCoRD)
  • Matches or exceeds original LLaMA performance in several benchmarks

Frequently Asked Questions

Q: What makes this model unique?

OpenLLaMA 3B is unique in being a fully open-source, permissively licensed reproduction of LLaMA that achieves comparable performance while being trained from scratch on the RedPajama dataset. It removes the licensing restrictions of the original LLaMA model.

Q: What are the recommended use cases?

The model is well-suited for various NLP tasks including question-answering, reasoning, and text generation. It's particularly valuable for researchers and developers who need a powerful, open-source language model they can freely modify and distribute.
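
As a small illustration of the question-answering and text-generation use cases, the sketch below continues from the loading example above; the prompt and the max_new_tokens setting are illustrative assumptions rather than author-recommended defaults.

```python
# Continues from the loading sketch above; the prompt and decoding
# settings are illustrative only.
prompt = "Q: What is the largest animal?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

output = model.generate(input_ids=input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Generation here uses greedy decoding by default; sampling options such as temperature or top_p can be passed to generate for more varied output.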
