# OpenLLaMA 7B v2
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Research Paper | arXiv:2302.13971 |
| Training Data | Falcon RefinedWeb, StarCoder, RedPajama |
Framework | PyTorch, Transformers |
## What is open_llama_7b_v2?
OpenLLaMA 7B v2 is a permissively licensed, open-source reproduction of Meta AI's LLaMA language model. The v2 release improves on v1 by training on a different mixture of high-quality public datasets (Falcon RefinedWeb, StarCoder, and parts of RedPajama), and its weights can serve as a drop-in replacement for the original LLaMA in existing implementations.
## Implementation Details
The model is trained with EasyLM, a JAX-based training pipeline, reaching a throughput of over 2,200 tokens per second per TPU-v4 chip. EasyLM combines normal data parallelism with fully sharded data parallelism (FSDP) to keep training efficient at this scale.
- Follows the same preprocessing steps and training hyperparameters as the original LLaMA
- Trained on 1T tokens using cloud TPU-v4s (a rough training-time estimate follows this list)
- Released in both PyTorch and JAX weight formats
- Uses a custom tokenizer trained from scratch rather than reusing the original LLaMA tokenizer
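As a quick sanity check on those figures, the sketch below turns the reported throughput (over 2,200 tokens/second/TPU-v4 chip) and the 1T-token budget into a rough wall-clock estimate. The chip count is an illustrative assumption, not a number reported for OpenLLaMA.

```python
# Back-of-the-envelope training-time estimate from the figures above.
TOKENS_TOTAL = 1_000_000_000_000      # 1T training tokens
TOKENS_PER_SEC_PER_CHIP = 2200        # reported throughput per TPU-v4 chip
NUM_CHIPS = 256                       # hypothetical pod slice size (assumption)

chip_seconds = TOKENS_TOTAL / TOKENS_PER_SEC_PER_CHIP
wall_clock_days = chip_seconds / NUM_CHIPS / 86_400  # seconds per day

print(f"total chip-seconds: {chip_seconds:.3e}")
print(f"wall-clock days on {NUM_CHIPS} chips: {wall_clock_days:.0f}")
```

With these assumptions the run works out to roughly three weeks of wall-clock time; the actual schedule depends on the real pod size and utilization.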
## Core Capabilities
- Performance comparable to the original LLaMA 7B across multiple benchmarks
- Strong results on tasks such as PIQA, ARC, and HellaSwag
- Effective for general text generation and completion tasks
- Straightforward integration with the Hugging Face Transformers library (see the loading sketch below)
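To make the Transformers integration concrete, here is a minimal loading-and-generation sketch. The repository id `openlm-research/open_llama_7b_v2`, the half-precision setting, and `device_map="auto"` are assumptions you may need to adapt to your environment; the OpenLLaMA release notes have recommended the slow (SentencePiece) tokenizer over the auto-converted fast one.

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

# Assumed Hugging Face repo id; adjust if your copy of the weights lives elsewhere.
model_path = "openlm-research/open_llama_7b_v2"

# Slow (SentencePiece) tokenizer, per the OpenLLaMA tokenizer recommendation.
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,   # half precision so a single GPU can hold the 7B weights
    device_map="auto",           # requires the accelerate package
)

prompt = "Q: What is the largest animal?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

output = model.generate(input_ids=input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```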
## Frequently Asked Questions
Q: What makes this model unique?
OpenLLaMA 7B v2 stands out for its permissive Apache 2.0 license and fully open training recipe: it is trained entirely on publicly available datasets and reaches performance comparable to the original LLaMA while remaining freely available for commercial use.
Q: What are the recommended use cases?
The model is well suited to a range of natural language processing tasks, including text generation, completion, and analysis. It is particularly useful for researchers and developers who need a capable open-source language model with commercial usage rights.
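For quick experimentation with these use cases, the high-level `pipeline` API is often enough. As in the earlier sketch, the repository id is an assumption, and the slow tokenizer is passed explicitly to sidestep the fast-tokenizer issues noted above.

```python
from transformers import LlamaTokenizer, pipeline

model_path = "openlm-research/open_llama_7b_v2"  # assumed Hugging Face repo id

# Stock text-generation pipeline; the slow tokenizer is supplied explicitly.
generator = pipeline(
    "text-generation",
    model=model_path,
    tokenizer=LlamaTokenizer.from_pretrained(model_path),
    device_map="auto",
)

result = generator("The main benefit of an Apache 2.0 license is", max_new_tokens=40)
print(result[0]["generated_text"])
```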