StableBeluga2
| Property | Value |
|---|---|
| Base Model | Llama2 70B |
| Developer | Stability AI |
| License | STABLE BELUGA NON-COMMERCIAL COMMUNITY LICENSE |
| Primary Language | English |
| Training Datasets | Orca-style Dataset |
What is StableBeluga2?
StableBeluga2 is Stability AI's instruction-tuned large language model, built on the Llama2 70B architecture and fine-tuned on an Orca-style dataset. It is designed to follow instructions accurately while maintaining safety and ethical considerations in its responses.
Implementation Details
The model was trained in mixed precision (BF16) with the AdamW optimizer, using a two-phase schedule: Phase 1 uses a batch size of 256 with packed data, while Phase 2 uses a batch size of 512 with unpacked data. Both phases use a learning rate of 3e-5 with cosine decay to 3e-6.
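The cosine decay schedule described above can be sketched as a small helper. This is a minimal illustration only: the `cosine_lr` function and the step counts are hypothetical; just the 3e-5 → 3e-6 endpoints come from the training details above.

```python
import math

def cosine_lr(step, total_steps, lr_max=3e-5, lr_min=3e-6):
    """Cosine decay from lr_max down to lr_min over total_steps.

    Hypothetical helper illustrating the schedule; only the endpoint
    values (3e-5 and 3e-6) are taken from the model card.
    """
    progress = step / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))

# The rate starts at 3e-5 and smoothly decays to 3e-6 by the final step.
print(cosine_lr(0, 1000))
print(cosine_lr(500, 1000))
print(cosine_lr(1000, 1000))
```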
- Specialized system prompt format for optimal interaction
- Support for text generation with configurable parameters
- Integration with HuggingFace Transformers library
- Optimized for both CPU and GPU deployment
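As a concrete sketch of the system prompt format and Transformers integration, the snippet below builds a prompt in the "### System: / ### User: / ### Assistant:" layout published on the model's Hugging Face card. The `build_prompt` helper and example messages are illustrative assumptions, not part of any official API; the commented-out lines show how the string would feed into a `transformers` generation call.

```python
def build_prompt(system_prompt: str, user_message: str) -> str:
    # StableBeluga2's card shows prompts structured as "### System:",
    # "### User:", and "### Assistant:" sections, in that order.
    return (
        f"### System:\n{system_prompt}\n\n"
        f"### User: {user_message}\n\n"
        f"### Assistant:\n"
    )

prompt = build_prompt(
    "You are Stable Beluga, an AI that follows instructions extremely well.",
    "Write me a haiku about the ocean.",
)
print(prompt)

# The resulting string is what you would tokenize and pass to generate(),
# roughly along these lines (sketch; requires downloading the 70B weights):
#   tokenizer = AutoTokenizer.from_pretrained("stabilityai/StableBeluga2", use_fast=False)
#   model = AutoModelForCausalLM.from_pretrained("stabilityai/StableBeluga2",
#                                                torch_dtype=torch.float16, device_map="auto")
#   inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
#   output = model.generate(**inputs, do_sample=True, top_p=0.95, max_new_tokens=256)
```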
Core Capabilities
- High-quality text generation and completion
- Instruction following with safety considerations
- Context-aware responses with system prompt integration
- Support for various text generation parameters (top_p, top_k)
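To make the `top_p` and `top_k` parameters concrete, here is a minimal, self-contained sketch of the filtering step they perform during sampling. This is an illustration, not the Transformers implementation; `top_k_top_p_filter` and the toy distribution are hypothetical.

```python
def top_k_top_p_filter(probs, top_k=0, top_p=1.0):
    """Restrict a token distribution before sampling.

    top_k keeps at most the k most likely tokens (0 means no limit);
    top_p keeps the smallest set of tokens whose cumulative probability
    reaches top_p (nucleus sampling). Returns the kept tokens with
    renormalized probabilities as {token_index: prob}.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # the nucleus now covers top_p of the mass
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# A toy 4-token distribution:
probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_top_p_filter(probs, top_p=0.9))  # nucleus: tokens 0, 1, 2
print(top_k_top_p_filter(probs, top_k=2))    # only the 2 most likely tokens
```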
Frequently Asked Questions
Q: What makes this model unique?
A: StableBeluga2 stands out through its combination of Llama2 70B's powerful base architecture and specialized Orca-style dataset training, making it particularly effective at following instructions while maintaining safety guidelines.
Q: What are the recommended use cases?
A: The model is best suited for research and non-commercial applications requiring sophisticated language understanding and generation, including chatbots, text completion, and assisted writing tasks, while adhering to ethical AI principles.