OpenLLaMA 7B
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Training Data | RedPajama-Data-1T |
| Parameters | 7 billion |
| Training Tokens | 1 trillion |
What is open_llama_7b?
OpenLLaMA 7B is a permissively licensed open-source reproduction of Meta AI's LLaMA language model. Developed by researchers at Berkeley AI Research, it represents a significant milestone in making large language models more accessible to the broader AI community. The model is trained on the RedPajama dataset, achieving performance comparable to the original LLaMA while being freely available under the Apache 2.0 license.
Implementation Details
The model is trained on cloud TPU-v4s with the EasyLM framework, using a combination of normal data parallelism and fully sharded data parallelism (FSDP), and achieves a throughput of over 2,200 tokens/second/TPU-v4 chip.
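As a rough illustration (not the EasyLM code itself), the sketch below shows how a JAX program can combine the two forms of parallelism: a two-dimensional device mesh with a data-parallel axis for splitting batches and an FSDP axis across which parameter tensors are sharded. Shapes and axis names here are hypothetical.

```python
# Illustrative JAX sketch of combining data parallelism with FSDP.
# This is NOT the EasyLM training code; shapes and axis names are hypothetical.
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 2-D device mesh: a 'dp' axis for data parallelism and an
# 'fsdp' axis across which parameters are sharded.
n_devices = len(jax.devices())
dp = 2 if n_devices % 2 == 0 else 1  # fall back gracefully on single-device hosts
mesh = Mesh(np.array(jax.devices()).reshape(dp, -1), axis_names=("dp", "fsdp"))

# Parameters: sharded along 'fsdp', replicated along 'dp'.
weight = jax.device_put(jnp.ones((4096, 4096)), NamedSharding(mesh, P("fsdp", None)))

# Batch: split along 'dp', replicated along 'fsdp'.
batch = jax.device_put(jnp.ones((32, 4096)), NamedSharding(mesh, P("dp", None)))

@jax.jit
def forward(w, x):
    # Under jit, XLA inserts the all-gathers/reduce-scatters implied by the shardings.
    return x @ w

print(forward(weight, batch).shape)  # (32, 4096)
```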
- Follows the same architecture and training hyperparameters as the original LLaMA
- Trained on the RedPajama dataset (roughly 1.2 trillion tokens in total), with the 7B model trained for 1 trillion tokens
- Available in both PyTorch and JAX formats (see the loading sketch below)
- Uses a custom tokenizer trained from scratch
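For the PyTorch format, a minimal loading sketch using Hugging Face transformers could look like the following. It assumes the openlm-research/open_llama_7b checkpoint on the Hugging Face Hub and uses the slow LlamaTokenizer, since the auto-converted fast tokenizer has been reported to produce incorrect tokenizations with this custom vocabulary.

```python
# Minimal sketch: loading the PyTorch weights with Hugging Face transformers.
# Assumes the openlm-research/open_llama_7b checkpoint on the Hugging Face Hub.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_path = "openlm-research/open_llama_7b"

# The slow (SentencePiece-based) LlamaTokenizer is used on purpose: the
# auto-converted fast tokenizer has been reported to mis-tokenize text
# under this custom vocabulary.
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision keeps the 7B weights around 14 GB
    device_map="auto",          # requires `accelerate`; places layers on available devices
)

prompt = "Q: What is the largest animal?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```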
Core Capabilities
- Strong performance on various benchmark tasks, including ANLI, ARC, HellaSwag, and PIQA
- Matches or exceeds original LLaMA performance on several metrics
- Achieves an average score of 0.55 across 21 evaluation tasks
- Supports both few-shot and zero-shot inference (see the prompting sketch below)
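Zero-shot and few-shot use differ only in how the prompt is built. The hypothetical sketch below reuses the same assumed checkpoint as above: the zero-shot prompt gives only an instruction, while the few-shot prompt prepends a couple of worked examples before the same query.

```python
# Sketch of zero-shot vs. few-shot prompting; the prompts are hypothetical examples.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_path = "openlm-research/open_llama_7b"  # same assumed checkpoint as above
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, device_map="auto")

zero_shot = (
    "Classify the sentiment of the review as positive or negative.\n"
    "Review: The battery died after two days.\nSentiment:"
)

few_shot = (
    "Review: Great sound and easy to set up.\nSentiment: positive\n\n"
    "Review: Stopped working after one week.\nSentiment: negative\n\n"
    "Review: The battery died after two days.\nSentiment:"
)

def complete(prompt: str, max_new_tokens: int = 8) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated continuation, not the prompt.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

print("zero-shot:", complete(zero_shot))
print("few-shot:", complete(few_shot))
```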
Frequently Asked Questions
Q: What makes this model unique?
OpenLLaMA 7B stands out for being a fully open-source reproduction of LLaMA with comparable performance, trained from scratch on the RedPajama dataset. Its Apache 2.0 license makes it particularly valuable for both research and commercial applications.
Q: What are the recommended use cases?
The model is suitable for a wide range of natural language processing tasks, including text generation, question answering, and reasoning. It's particularly useful for researchers and developers who need a powerful language model with permissive licensing terms.