OpenLLaMA 3B v2
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Training Data | Falcon RefinedWeb, StarCoder, RedPajama |
| Model Size | 3 billion parameters |
| Framework | PyTorch/JAX |
What is open_llama_3b_v2?
OpenLLaMA 3B v2 is a permissively licensed, open-source reproduction of Meta AI's LLaMA language model. Trained from scratch on 1 trillion tokens of publicly available data, it is designed to serve as a drop-in replacement for the original LLaMA weights in existing pipelines.
Implementation Details
The model was trained on cloud TPU-v4s using EasyLM, a JAX-based training pipeline, sustaining a throughput of over 2,200 tokens/second/TPU-v4 chip. Training combines normal data parallelism with fully sharded data parallelism (FSDP, also known as ZeRO stage 3) to balance throughput and memory usage; a simplified sketch of this layout follows the list below.
- Trained on multiple high-quality datasets including Falcon RefinedWeb, StarCoder, and RedPajama
- Follows the same preprocessing steps and training hyperparameters as the original LLaMA
- Available in both PyTorch and JAX formats
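EasyLM's actual training configuration is not reproduced here; the sketch below shows, under illustrative assumptions, how a 2-D JAX device mesh can combine a data-parallel axis with an FSDP axis. The axis names (`dp`, `fsdp`), mesh shape, and array sizes are hypothetical choices for demonstration only.

```python
import numpy as np
import jax
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# Arrange all available devices into a (data-parallel, fsdp) grid.
# The fsdp axis has size 1 here so the example also runs on one device.
devices = np.array(jax.devices()).reshape(-1, 1)
mesh = Mesh(devices, axis_names=("dp", "fsdp"))

# Batches are split along the 'dp' axis; parameters are sharded along 'fsdp'.
batch_sharding = NamedSharding(mesh, PartitionSpec("dp"))
param_sharding = NamedSharding(mesh, PartitionSpec("fsdp"))

x = jax.device_put(np.ones((8, 128), dtype=np.float32), batch_sharding)
w = jax.device_put(np.ones((128, 128), dtype=np.float32), param_sharding)

@jax.jit
def forward(x, w):
    # XLA inserts the collectives implied by the input shardings.
    return x @ w

y = forward(x, w)
print(y.sharding)  # inspect how the result ended up sharded
```

In a full FSDP setup the parameter axis spans many devices, so each device holds only a shard of the weights and gathers the rest on demand during the forward and backward passes.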
Core Capabilities
- General text generation and completion tasks
- Performance competitive with the original LLaMA across multiple benchmarks
- Seamless integration with the Hugging Face transformers library (see the usage sketch after this list)
- Support for context-aware text generation
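As a rough illustration of the transformers integration, the snippet below loads the checkpoint from the `openlm-research/open_llama_3b_v2` Hugging Face repository and generates a short completion. The prompt and generation settings are arbitrary examples; the slow SentencePiece tokenizer is used because auto-converted fast LLaMA tokenizers have been reported to mis-tokenize.

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_path = "openlm-research/open_llama_3b_v2"

# Slow (SentencePiece) tokenizer: fast LLaMA tokenizers auto-converted
# from SentencePiece have been reported to tokenize incorrectly.
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision to reduce memory
    device_map="auto",          # requires the `accelerate` package
)

prompt = "Q: What is the largest animal?\nA:"  # arbitrary example prompt
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

# Greedy decoding of up to 32 new tokens.
output = model.generate(input_ids=input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```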
Frequently Asked Questions
Q: What makes this model unique?
This model is unique in being an open-source, permissively licensed alternative to LLaMA, trained from scratch on publicly available datasets. It achieves comparable performance to the original while being freely available for commercial use.
Q: What are the recommended use cases?
The model is well-suited for various NLP tasks including text generation, completion, and analysis. It's particularly valuable for researchers and developers who need a powerful language model with permissive licensing for commercial applications.