open_llama_3b_v2

Maintained By
openlm-research

OpenLLaMA 3B v2

License: Apache 2.0
Training Data: Falcon RefinedWeb, StarCoder, RedPajama
Model Size: 3 billion parameters
Framework: PyTorch/JAX

What is open_llama_3b_v2?

OpenLLaMA 3B v2 is a permissively licensed, open-source reproduction of Meta AI's LLaMA language model. It is trained from scratch on 1 trillion tokens of publicly available data and is designed to serve as a drop-in replacement for the original LLaMA implementation.
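Because the architecture matches the original LLaMA, the weights load with the standard Llama classes in Hugging Face transformers. The following is a minimal loading sketch, not official usage: the torch_dtype and device_map choices are illustrative (device_map="auto" additionally requires the accelerate package), and the slow LlamaTokenizer is used because the upstream model card advises against the auto-converted fast tokenizer.

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = "openlm-research/open_llama_3b_v2"

# Slow (SentencePiece) tokenizer; the auto-converted fast tokenizer
# has been reported upstream to produce incorrect tokenizations.
tokenizer = LlamaTokenizer.from_pretrained(model_path)

# float16 halves the memory footprint; device_map="auto" spreads the
# weights across available devices (requires the accelerate package).
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)
```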

Implementation Details

The model is trained on cloud TPU-v4s using EasyLM, achieving a throughput of over 2,200 tokens/second per TPU-v4 chip. Training combines normal data parallelism with fully sharded data parallelism (FSDP, equivalent to ZeRO stage 3) to keep per-chip memory use manageable at high throughput; a sketch of the sharding idea follows the list below.

  • Trained on a mixture of high-quality datasets: Falcon RefinedWeb, StarCoder, and RedPajama
  • Follows the same preprocessing steps and training hyperparameters as the original LLaMA
  • Available in both PyTorch and JAX weight formats
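As a rough illustration of the parallelism strategy mentioned above, the sketch below expresses FSDP/ZeRO-stage-3 sharding in PyTorch terms. It is an assumption-laden stand-in, not the actual training code: OpenLLaMA was trained with EasyLM on TPU-v4 pods (JAX), and the NCCL/torchrun launch details here are hypothetical.

```python
# Illustrative only: OpenLLaMA was trained with EasyLM on TPU-v4 (JAX);
# this PyTorch sketch shows the equivalent FSDP/ZeRO-3 sharding idea.
# Hypothetical launch: torchrun --nproc_per_node=<num_gpus> train_fsdp.py
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import LlamaForCausalLM

dist.init_process_group("nccl")  # one process per GPU under torchrun
torch.cuda.set_device(dist.get_rank())

model = LlamaForCausalLM.from_pretrained("openlm-research/open_llama_3b_v2")

# FSDP shards parameters, gradients, and optimizer state across ranks
# (the sharding that ZeRO stage 3 refers to), gathering parameters on
# the fly for each forward/backward pass.
model = FSDP(model.cuda())
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
```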

Core Capabilities

  • General text generation and completion tasks
  • Performance competitive with the original LLaMA across multiple benchmarks
  • Seamless integration with the Hugging Face transformers library (see the generation example below)
  • Support for context-aware text generation
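Continuing from the loading sketch above, generation goes through the standard generate API; the prompt and decoding settings below are arbitrary examples.

```python
prompt = "Q: What is the largest animal?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

# Greedy decoding with a small token budget; sampling controls such as
# do_sample, temperature, or top_p can be passed to generate() as needed.
output_ids = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```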

Frequently Asked Questions

Q: What makes this model unique?

This model is unique in being an open-source, permissively licensed alternative to LLaMA, trained from scratch on publicly available datasets. It achieves comparable performance to the original while being freely available for commercial use.

Q: What are the recommended use cases?

The model is well-suited for various NLP tasks including text generation, completion, and analysis. It's particularly valuable for researchers and developers who need a powerful language model with permissive licensing for commercial applications.
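For quick experiments along these lines, the high-level transformers pipeline wraps loading, tokenization, and decoding in one call. A minimal sketch, again passing the slow tokenizer explicitly:

```python
from transformers import LlamaTokenizer, pipeline

model_path = "openlm-research/open_llama_3b_v2"
generator = pipeline(
    "text-generation",
    model=model_path,
    # Reuse the slow tokenizer for the same reason noted earlier.
    tokenizer=LlamaTokenizer.from_pretrained(model_path),
)
result = generator("The largest animal on Earth is", max_new_tokens=32)
print(result[0]["generated_text"])
```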
