OpenLLaMA 7B
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Training Data | RedPajama-Data-1T |
| Parameters | 7 billion |
| Training Tokens | 1 trillion |
What is open_llama_7b?
OpenLLaMA 7B is a permissively licensed open-source reproduction of Meta AI's LLaMA language model. Developed by researchers at Berkeley AI Research, it represents a significant milestone in making large language models more accessible to the broader AI community. The model is trained on the RedPajama dataset, achieving performance comparable to the original LLaMA while being freely available under the Apache 2.0 license.
Implementation Details
The model is trained on cloud TPU-v4s with the EasyLM framework, using a combination of normal data parallelism and fully sharded data parallelism (FSDP), and achieves a throughput of over 2,200 tokens/second/TPU-v4 chip.
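As a rough illustration (not the EasyLM code itself), the sketch below shows how a JAX program can combine the two forms of parallelism: a two-dimensional device mesh with a data-parallel axis for splitting batches and an FSDP axis across which parameter tensors are sharded. Shapes and axis names here are hypothetical.

```python
# Illustrative JAX sketch of combining data parallelism with FSDP.
# This is NOT the EasyLM training code; shapes and axis names are hypothetical.
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 2-D device mesh: a 'dp' axis for data parallelism and an
# 'fsdp' axis across which parameters are sharded.
n_devices = len(jax.devices())
dp = 2 if n_devices % 2 == 0 else 1  # fall back gracefully on single-device hosts
mesh = Mesh(np.array(jax.devices()).reshape(dp, -1), axis_names=("dp", "fsdp"))

# Parameters: sharded along 'fsdp', replicated along 'dp'.
weight = jax.device_put(jnp.ones((4096, 4096)), NamedSharding(mesh, P("fsdp", None)))

# Batch: split along 'dp', replicated along 'fsdp'.
batch = jax.device_put(jnp.ones((32, 4096)), NamedSharding(mesh, P("dp", None)))

@jax.jit
def forward(w, x):
    # Under jit, XLA inserts the all-gathers/reduce-scatters implied by the shardings.
    return x @ w

print(forward(weight, batch).shape)  # (32, 4096)
```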
- Follows the same architecture and training hyperparameters as the original LLaMA
- Trained on the RedPajama dataset (roughly 1.2 trillion tokens in total), with the 7B model trained for 1 trillion tokens
- Available in both PyTorch and JAX formats (see the loading sketch below)
- Uses a custom tokenizer trained from scratch
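For the PyTorch format, a minimal loading sketch using Hugging Face transformers could look like the following. It assumes the openlm-research/open_llama_7b checkpoint on the Hugging Face Hub and uses the slow LlamaTokenizer, since the auto-converted fast tokenizer has been reported to produce incorrect tokenizations with this custom vocabulary.

```python
# Minimal sketch: loading the PyTorch weights with Hugging Face transformers.
# Assumes the openlm-research/open_llama_7b checkpoint on the Hugging Face Hub.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_path = "openlm-research/open_llama_7b"

# The slow (SentencePiece-based) LlamaTokenizer is used on purpose: the
# auto-converted fast tokenizer has been reported to mis-tokenize text
# under this custom vocabulary.
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision keeps the 7B weights around 14 GB
    device_map="auto",          # requires `accelerate`; places layers on available devices
)

prompt = "Q: What is the largest animal?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```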
Core Capabilities
- Strong performance on various benchmark tasks, including ANLI, ARC, HellaSwag, and PIQA
- Matches or exceeds original LLaMA performance on several metrics
- Achieves an average score of 0.55 across 21 evaluation tasks
- Supports both few-shot and zero-shot inference (see the prompting sketch below)
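Zero-shot and few-shot use differ only in how the prompt is built. The hypothetical sketch below reuses the same assumed checkpoint as above: the zero-shot prompt gives only an instruction, while the few-shot prompt prepends a couple of worked examples before the same query.

```python
# Sketch of zero-shot vs. few-shot prompting; the prompts are hypothetical examples.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_path = "openlm-research/open_llama_7b"  # same assumed checkpoint as above
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, device_map="auto")

zero_shot = (
    "Classify the sentiment of the review as positive or negative.\n"
    "Review: The battery died after two days.\nSentiment:"
)

few_shot = (
    "Review: Great sound and easy to set up.\nSentiment: positive\n\n"
    "Review: Stopped working after one week.\nSentiment: negative\n\n"
    "Review: The battery died after two days.\nSentiment:"
)

def complete(prompt: str, max_new_tokens: int = 8) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated continuation, not the prompt.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

print("zero-shot:", complete(zero_shot))
print("few-shot:", complete(few_shot))
```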
Frequently Asked Questions
Q: What makes this model unique?
OpenLLaMA 7B stands out for being a fully open-source reproduction of LLaMA with comparable performance, trained from scratch on the RedPajama dataset. Its Apache 2.0 license makes it particularly valuable for both research and commercial applications.
Q: What are the recommended use cases?
The model is suitable for a wide range of natural language processing tasks, including text generation, question answering, and reasoning. It's particularly useful for researchers and developers who need a powerful language model with permissive licensing terms.