open_llama_3b

Maintained By: openlm-research

OpenLLaMA 3B

Parameters: 3 Billion
License: Apache 2.0
Training Data: RedPajama-Data-1T
Framework: PyTorch/JAX

What is open_llama_3b?

OpenLLaMA 3B is an open-source reproduction of Meta AI's LLaMA language model, trained on 1 trillion tokens. It represents a significant achievement in democratizing access to large language models, offering comparable performance to the original LLaMA while being freely available under an Apache 2.0 license.

Implementation Details

The model is trained using the EasyLM framework on cloud TPU-v4s, implementing both standard data parallelism and fully sharded data parallelism (FSDP). The architecture follows the original LLaMA design, maintaining identical hyperparameters, context length, and training procedures.

  • Trained on 1 trillion tokens drawn from the RedPajama dataset (which contains roughly 1.2 trillion tokens)
  • Implements identical architecture and training parameters as original LLaMA
  • Available in both PyTorch and JAX formats (see the loading sketch after this list)
  • Achieves competitive performance across multiple benchmarks
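
For reference, here is a minimal loading sketch for the PyTorch weights using the Hugging Face Transformers library. It assumes the checkpoint published on the Hugging Face Hub as openlm-research/open_llama_3b, half-precision loading, and automatic device placement (which additionally requires the accelerate package).

```python
# Minimal loading sketch (PyTorch weights via Hugging Face Transformers).
# Assumes the openlm-research/open_llama_3b checkpoint on the Hugging Face Hub;
# device_map="auto" additionally requires the accelerate package.
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = "openlm-research/open_llama_3b"

# The slow LlamaTokenizer is used here; at release time the auto-converted
# fast tokenizer was reported to tokenize incorrectly for this model.
tokenizer = LlamaTokenizer.from_pretrained(model_path)

model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # spread weights across available devices
)
```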

Core Capabilities

  • Strong performance in reasoning tasks (ANLI, ARC)
  • Effective at common sense reasoning (PIQA, HellaSwag)
  • Competitive accuracy in reading comprehension (ReCoRD)
  • Matches or exceeds original LLaMA performance in several benchmarks

Frequently Asked Questions

Q: What makes this model unique?

OpenLLaMA 3B is unique in being a fully open-source, permissively licensed reproduction of LLaMA that achieves comparable performance while being trained from scratch on the RedPajama dataset. It removes the licensing restrictions of the original LLaMA model.

Q: What are the recommended use cases?

The model is well-suited for various NLP tasks including question-answering, reasoning, and text generation. It's particularly valuable for researchers and developers who need a powerful, open-source language model they can freely modify and distribute.
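
As a small illustration of the question-answering and text-generation use cases, the sketch below continues from the loading example above; the prompt and the max_new_tokens setting are illustrative assumptions rather than author-recommended defaults.

```python
# Continues from the loading sketch above; the prompt and decoding
# settings are illustrative only.
prompt = "Q: What is the largest animal?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

output = model.generate(input_ids=input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Generation here uses greedy decoding by default; sampling options such as temperature or top_p can be passed to generate for more varied output.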
