StableLM 2 Zephyr 1.6B
| Property | Value |
|---|---|
| Parameter Count | 1.64B |
| License | StabilityAI Non-Commercial Research |
| Paper | Technical Report |
| Training Data | 8 datasets including UltraChat, MetaMathQA |
| MT-Bench Score | 5.42 |
What is stablelm-2-zephyr-1_6b?
StableLM 2 Zephyr 1.6B is an instruction-tuned language model from Stability AI, designed for instruction-following and conversational tasks. It shows that a comparatively small model can retain strong chat performance when aligned carefully. The model was trained with Direct Preference Optimization (DPO) and builds on HuggingFaceH4's Zephyr 7B training pipeline.
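For context, DPO tunes the policy directly on preference pairs instead of training a separate reward model. The standard DPO objective (from Rafailov et al., 2023; the exact β and data weighting used for this model are not detailed here) is:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) =
-\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
\left[\log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right)\right]
$$

where \(\pi_\theta\) is the model being trained, \(\pi_{\mathrm{ref}}\) is the supervised fine-tuned reference model, \(y_w\) and \(y_l\) are the preferred and rejected responses for prompt \(x\), \(\sigma\) is the sigmoid, and \(\beta\) controls how far the policy may drift from the reference.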
Implementation Details
The model is an auto-regressive, decoder-only transformer. It uses FP16 precision and was trained on 8 nodes with 8 A100 80GB GPUs each. Training combined supervised fine-tuning with preference learning through DPO.
- Trained on a diverse mix of 8 high-quality datasets
- Implements a chat template for structured conversations (see the usage sketch after this list)
- Supports max_new_tokens up to 1024
- Optimized for English language tasks
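The snippet below is a minimal usage sketch with the Hugging Face transformers library; the repository id, the generation settings, and the possible need for trust_remote_code on older transformers versions are assumptions to verify against the official model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repository id; confirm against the official model card.
model_id = "stabilityai/stablelm-2-zephyr-1_6b"

# Older transformers releases may require trust_remote_code=True for this architecture.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # loads the FP16 weights in half precision where supported
    device_map="auto",    # requires the accelerate package
)

# The tokenizer ships a chat template, so a message list can be rendered directly.
messages = [{"role": "user", "content": "Summarize what DPO training does in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=1024,  # generation budget mentioned above
    do_sample=True,
    temperature=0.7,
)

# Strip the prompt tokens and decode only the newly generated reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```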
Core Capabilities
- MT-Bench score of 5.42, competitive for a model of this size
- 49.89% average score on the Open LLM Leaderboard
- Efficient resource utilization with 1.6B parameters
- Specialized in instruction-following and chat applications
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for achieving strong instruction-following and chat performance despite its small 1.6B-parameter size, which makes it more accessible to deploy than larger chat models.
Q: What are the recommended use cases?
The model is best suited for chat-style applications and general instruction-following tasks. Because it can hallucinate, it should not be deployed without appropriate input/output safeguards and content filtering; a minimal illustration of such a safeguard follows.
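Purely as an illustrative sketch of what an input/output safeguard can look like (nothing here is part of the model or its tooling, and the BLOCKLIST terms and moderate helper are hypothetical), a thin wrapper around the generation call might check both the prompt and the reply; production systems should use a dedicated moderation model or service rather than keyword matching.

```python
# Hypothetical safeguard wrapper: naive blocklist checks on the prompt and the reply.
# A real deployment should rely on a proper moderation model/service, not keyword matching.
BLOCKLIST = {"example_banned_term"}  # placeholder terms, not a real content policy


def moderate(text: str) -> bool:
    """Return True if the text passes the (toy) content filter."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)


def safe_chat(generate_fn, user_message: str) -> str:
    """Run the model only if the prompt passes the filter, then filter the reply too."""
    if not moderate(user_message):
        return "Request declined by the input filter."
    reply = generate_fn(user_message)  # e.g. a function wrapping the generation snippet above
    if not moderate(reply):
        return "Response withheld by the output filter."
    return reply
```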