open_llama_13b

Maintained By
openlm-research

OpenLLaMA 13B

License: Apache 2.0
Training Data: RedPajama-Data-1T
Framework: PyTorch, JAX
Author: openlm-research

What is open_llama_13b?

OpenLLaMA 13B is a permissively licensed open-source reproduction of Meta AI's LLaMA language model. It's trained on 1 trillion tokens from the RedPajama dataset and achieves performance comparable to the original LLaMA model across various benchmarks.

Implementation Details

The model follows the same architecture and training hyperparameters as the original LLaMA and was trained on cloud TPU-v4s with the EasyLM framework. Training combines normal data parallelism with fully sharded data parallelism (ZeRO stage 3) to balance throughput and memory usage.
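As a quick sanity check on the architecture claim, the published config can be inspected with the Transformers library. This is a minimal sketch; the values noted in the comments are the standard LLaMA 13B shape and should be read as expectations rather than guarantees.

    # Sketch: inspect the checkpoint config to confirm the LLaMA-13B architecture.
    # The commented values are the standard LLaMA 13B shape (assumed, not verified here).
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained("openlm-research/open_llama_13b")

    print(config.model_type)           # expected: "llama"
    print(config.hidden_size)          # expected: 5120
    print(config.num_hidden_layers)    # expected: 40
    print(config.num_attention_heads)  # expected: 40
    print(config.vocab_size)           # expected: 32000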

  • Trained on 1 trillion tokens from the RedPajama dataset (the full dataset contains roughly 1.2 trillion tokens)
  • Identical architecture to the original LLaMA
  • Supports both PyTorch and JAX frameworks
  • Available through the Hugging Face Transformers library (see the loading sketch after this list)
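A minimal loading and generation sketch with the Transformers library is shown below. It assumes the checkpoint is published as openlm-research/open_llama_13b on the Hugging Face Hub and that the accelerate package is installed for device_map="auto"; half precision is used so the 13B weights fit on a single large GPU.

    # Minimal sketch: load OpenLLaMA 13B and generate a short completion.
    import torch
    from transformers import LlamaTokenizer, LlamaForCausalLM

    model_path = "openlm-research/open_llama_13b"

    # The slow, SentencePiece-based LlamaTokenizer is used here rather than the
    # auto-converted fast tokenizer.
    tokenizer = LlamaTokenizer.from_pretrained(model_path)
    model = LlamaForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.float16,  # half precision to reduce memory footprint
        device_map="auto",          # requires the accelerate package
    )

    prompt = "Q: What is the largest animal?\nA:"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

    output = model.generate(input_ids=input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))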

Core Capabilities

  • Strong performance on tasks like ARC, PIQA, and ANLI
  • Matches or exceeds original LLaMA performance on several benchmarks
  • Achieves 91% accuracy on the ReCoRD evaluation
  • Effective at both few-shot and zero-shot tasks (see the prompting sketch after this list)
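Few-shot usage amounts to prepending worked examples to the prompt. The sketch below uses the Transformers text-generation pipeline; the example questions are illustrative and not drawn from any benchmark.

    # Sketch: few-shot prompting via the text-generation pipeline.
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="openlm-research/open_llama_13b",
        torch_dtype=torch.float16,
        device_map="auto",  # requires the accelerate package
    )

    # Two worked examples followed by the actual question (illustrative content).
    few_shot_prompt = (
        "Q: What gas do plants absorb from the atmosphere?\nA: carbon dioxide\n\n"
        "Q: What planet is known as the Red Planet?\nA: Mars\n\n"
        "Q: What is the largest ocean on Earth?\nA:"
    )

    result = generator(few_shot_prompt, max_new_tokens=8, do_sample=False)
    print(result[0]["generated_text"])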

Frequently Asked Questions

Q: What makes this model unique?

OpenLLaMA 13B stands out for being a fully open-source, Apache 2.0 licensed alternative to the original LLaMA model, trained completely from scratch including the tokenizer. It achieves comparable performance while being freely available for commercial use.

Q: What are the recommended use cases?

The model is suitable for various natural language processing tasks including question-answering, text completion, and reasoning tasks. It's particularly effective for applications requiring strong performance on academic benchmarks and general language understanding.
