OpenLLaMA 13B
Property | Value |
---|---|
License | Apache 2.0 |
Training Data | RedPajama-Data-1T |
Framework | PyTorch, JAX |
Author | openlm-research |
What is open_llama_13b?
OpenLLaMA 13B is a permissively licensed open-source reproduction of Meta AI's LLaMA language model. It's trained on 1 trillion tokens from the RedPajama dataset and achieves performance comparable to the original LLaMA model across various benchmarks.
Implementation Details
The model follows the same architecture and training hyperparameters as the original LLaMA and was trained on cloud TPU-v4s with the EasyLM framework. Training combines normal data parallelism with fully sharded data parallelism (also known as ZeRO stage 3) to balance throughput and memory usage.
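For readers more familiar with PyTorch than EasyLM, the sketch below shows an analogous fully sharded (ZeRO stage 3 style) setup using PyTorch's FSDP wrapper. It is illustrative only and not the code path used to train OpenLLaMA; the model repo name is real, but the launch setup and learning rate are placeholder choices.

```python
# Illustrative ZeRO-3-style sharding with PyTorch FSDP (not the EasyLM/TPU training code).
# Launch with one process per GPU, e.g. via torchrun; a 13B model needs substantial GPU memory.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import LlamaForCausalLM

def main():
    dist.init_process_group("nccl")
    local_rank = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)

    model = LlamaForCausalLM.from_pretrained("openlm-research/open_llama_13b")

    # FSDP shards parameters, gradients, and optimizer state across ranks,
    # which is the same idea as ZeRO stage 3.
    model = FSDP(model, device_id=local_rank)

    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # placeholder hyperparameter
    # ... standard training loop: forward pass, loss.backward(), optimizer.step()

if __name__ == "__main__":
    main()
```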
- Trained on 1 trillion tokens of the RedPajama dataset (which itself contains over 1.2 trillion tokens)
- Identical architecture to the original LLaMA
- Weights released in both PyTorch and JAX formats
- Available through Hugging Face Transformers library
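A minimal loading and generation sketch with Hugging Face Transformers is shown below. It uses the slow tokenizer, since the auto-converted fast tokenizer has known issues for the OpenLLaMA models; the dtype, device placement, and max_new_tokens are illustrative choices rather than requirements.

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = "openlm-research/open_llama_13b"

# Use the slow (SentencePiece) tokenizer; the auto-converted fast tokenizer
# is known to produce incorrect tokenization for this model.
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Q: What is the largest animal?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

generation_output = model.generate(input_ids=input_ids, max_new_tokens=32)
print(tokenizer.decode(generation_output[0]))
```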
Core Capabilities
- Strong performance on tasks like ARC, PIQA, and ANLI
- Matches or exceeds original LLaMA performance on several benchmarks
- Achieves 91% accuracy on the ReCoRD evaluation
- Effective at both few-shot and zero-shot tasks
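As a rough illustration of zero-shot versus few-shot use with a base (non-instruction-tuned) model, the sketch below feeds both prompt styles to the model. It assumes `model` and `tokenizer` have been loaded as in the earlier sketch, and the example questions are arbitrary.

```python
# Zero-shot: the question alone; few-shot: a handful of worked examples first.
zero_shot = "Q: What is the capital of France?\nA:"

few_shot = (
    "Q: What is the capital of Germany?\nA: Berlin\n"
    "Q: What is the capital of Japan?\nA: Tokyo\n"
    "Q: What is the capital of France?\nA:"
)

for prompt in (zero_shot, few_shot):
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    output = model.generate(input_ids=input_ids, max_new_tokens=8)
    # Print only the newly generated continuation, not the prompt.
    print(tokenizer.decode(output[0][input_ids.shape[1]:]))
```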
Frequently Asked Questions
Q: What makes this model unique?
OpenLLaMA 13B stands out for being a fully open-source, Apache 2.0 licensed alternative to the original LLaMA model, trained completely from scratch including the tokenizer. It achieves comparable performance while being freely available for commercial use.
Q: What are the recommended use cases?
The model is suitable for various natural language processing tasks including question-answering, text completion, and reasoning tasks. It's particularly effective for applications requiring strong performance on academic benchmarks and general language understanding.