# OpenLLaMA 7B v2
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Research Paper | arXiv:2302.13971 |
| Training Data | Falcon RefinedWeb, StarCoder, RedPajama |
Framework | PyTorch, Transformers |
## What is open_llama_7b_v2?
OpenLLaMA 7B v2 is a permissively licensed, open-source reproduction of Meta AI's LLaMA language model. The v2 release improves on v1 by training on a different mixture of high-quality public datasets (Falcon RefinedWeb, StarCoder, and parts of RedPajama), and its weights can serve as a drop-in replacement for the original LLaMA in existing implementations.
## Implementation Details
The model is trained with EasyLM, a JAX-based training pipeline, reaching a throughput of over 2,200 tokens per second per TPU-v4 chip. EasyLM combines normal data parallelism with fully sharded data parallelism (FSDP) to keep training efficient at this scale.
- Follows the same preprocessing steps and training hyperparameters as the original LLaMA
- Trained on 1T tokens using cloud TPU-v4s (a rough training-time estimate follows this list)
- Released in both PyTorch and JAX weight formats
- Uses a custom tokenizer trained from scratch rather than reusing the original LLaMA tokenizer
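As a quick sanity check on those figures, the sketch below turns the reported throughput (over 2,200 tokens/second/TPU-v4 chip) and the 1T-token budget into a rough wall-clock estimate. The chip count is an illustrative assumption, not a number reported for OpenLLaMA.

```python
# Back-of-the-envelope training-time estimate from the figures above.
TOKENS_TOTAL = 1_000_000_000_000      # 1T training tokens
TOKENS_PER_SEC_PER_CHIP = 2200        # reported throughput per TPU-v4 chip
NUM_CHIPS = 256                       # hypothetical pod slice size (assumption)

chip_seconds = TOKENS_TOTAL / TOKENS_PER_SEC_PER_CHIP
wall_clock_days = chip_seconds / NUM_CHIPS / 86_400  # seconds per day

print(f"total chip-seconds: {chip_seconds:.3e}")
print(f"wall-clock days on {NUM_CHIPS} chips: {wall_clock_days:.0f}")
```

With these assumptions the run works out to roughly three weeks of wall-clock time; the actual schedule depends on the real pod size and utilization.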
## Core Capabilities
- Performance comparable to the original LLaMA 7B across multiple benchmarks
- Strong results on tasks such as PIQA, ARC, and HellaSwag
- Effective for general text generation and completion tasks
- Straightforward integration with the Hugging Face Transformers library (see the loading sketch below)
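To make the Transformers integration concrete, here is a minimal loading-and-generation sketch. The repository id `openlm-research/open_llama_7b_v2`, the half-precision setting, and `device_map="auto"` are assumptions you may need to adapt to your environment; the OpenLLaMA release notes have recommended the slow (SentencePiece) tokenizer over the auto-converted fast one.

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

# Assumed Hugging Face repo id; adjust if your copy of the weights lives elsewhere.
model_path = "openlm-research/open_llama_7b_v2"

# Slow (SentencePiece) tokenizer, per the OpenLLaMA tokenizer recommendation.
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,   # half precision so a single GPU can hold the 7B weights
    device_map="auto",           # requires the accelerate package
)

prompt = "Q: What is the largest animal?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

output = model.generate(input_ids=input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```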
## Frequently Asked Questions
Q: What makes this model unique?
OpenLLaMA 7B v2 stands out for its permissive Apache 2.0 license and fully open training recipe: it is trained entirely on publicly available datasets and reaches performance comparable to the original LLaMA while remaining freely available for commercial use.
Q: What are the recommended use cases?
The model is well suited to a range of natural language processing tasks, including text generation, completion, and analysis. It is particularly useful for researchers and developers who need a capable open-source language model with commercial usage rights.
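For quick experimentation with these use cases, the high-level `pipeline` API is often enough. As in the earlier sketch, the repository id is an assumption, and the slow tokenizer is passed explicitly to sidestep the fast-tokenizer issues noted above.

```python
from transformers import LlamaTokenizer, pipeline

model_path = "openlm-research/open_llama_7b_v2"  # assumed Hugging Face repo id

# Stock text-generation pipeline; the slow tokenizer is supplied explicitly.
generator = pipeline(
    "text-generation",
    model=model_path,
    tokenizer=LlamaTokenizer.from_pretrained(model_path),
    device_map="auto",
)

result = generator("The main benefit of an Apache 2.0 license is", max_new_tokens=40)
print(result[0]["generated_text"])
```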