StableLM 2 Zephyr 1.6B
| Property | Value |
|---|---|
| Parameter Count | 1.64B |
| License | StabilityAI Non-Commercial Research |
| Paper | Technical Report |
| Training Data | 8 datasets including UltraChat, MetaMathQA |
| MT-Bench Score | 5.42 |
What is stablelm-2-zephyr-1_6b?
StableLM 2 Zephyr 1.6B is an instruction-tuned language model from Stability AI, designed for instruction-following and conversational tasks. It shows that a comparatively small model can retain strong chat performance when aligned carefully. The model was trained with Direct Preference Optimization (DPO) and builds on HuggingFaceH4's Zephyr 7B training pipeline.
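For context, DPO tunes the policy directly on preference pairs instead of training a separate reward model. The standard DPO objective (from Rafailov et al., 2023; the exact β and data weighting used for this model are not detailed here) is:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) =
-\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
\left[\log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right)\right]
$$

where \(\pi_\theta\) is the model being trained, \(\pi_{\mathrm{ref}}\) is the supervised fine-tuned reference model, \(y_w\) and \(y_l\) are the preferred and rejected responses for prompt \(x\), \(\sigma\) is the sigmoid, and \(\beta\) controls how far the policy may drift from the reference.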
Implementation Details
The model is an auto-regressive, decoder-only transformer. It uses FP16 precision and was trained on 8 nodes with 8 A100 80GB GPUs each. Training combined supervised fine-tuning with preference learning through DPO.
- Trained on a diverse mix of 8 high-quality datasets
- Implements a chat template for structured conversations (see the usage sketch after this list)
- Supports max_new_tokens up to 1024
- Optimized for English language tasks
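The snippet below is a minimal usage sketch with the Hugging Face transformers library; the repository id, the generation settings, and the possible need for trust_remote_code on older transformers versions are assumptions to verify against the official model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repository id; confirm against the official model card.
model_id = "stabilityai/stablelm-2-zephyr-1_6b"

# Older transformers releases may require trust_remote_code=True for this architecture.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # loads the FP16 weights in half precision where supported
    device_map="auto",    # requires the accelerate package
)

# The tokenizer ships a chat template, so a message list can be rendered directly.
messages = [{"role": "user", "content": "Summarize what DPO training does in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=1024,  # generation budget mentioned above
    do_sample=True,
    temperature=0.7,
)

# Strip the prompt tokens and decode only the newly generated reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```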
Core Capabilities
- MT-Bench score of 5.42, competitive for a model of this size
- 49.89% average score on the Open LLM Leaderboard
- Efficient resource utilization with 1.6B parameters
- Specialized in instruction-following and chat applications
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for achieving strong instruction-following and chat performance despite its small 1.6B-parameter size, which makes it more accessible to deploy than larger chat models.
Q: What are the recommended use cases?
The model is best suited for chat-style applications and general instruction-following tasks. Because it can hallucinate, it should not be deployed without appropriate input/output safeguards and content filtering; a minimal illustration of such a safeguard follows.
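Purely as an illustrative sketch of what an input/output safeguard can look like (nothing here is part of the model or its tooling, and the BLOCKLIST terms and moderate helper are hypothetical), a thin wrapper around the generation call might check both the prompt and the reply; production systems should use a dedicated moderation model or service rather than keyword matching.

```python
# Hypothetical safeguard wrapper: naive blocklist checks on the prompt and the reply.
# A real deployment should rely on a proper moderation model/service, not keyword matching.
BLOCKLIST = {"example_banned_term"}  # placeholder terms, not a real content policy


def moderate(text: str) -> bool:
    """Return True if the text passes the (toy) content filter."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)


def safe_chat(generate_fn, user_message: str) -> str:
    """Run the model only if the prompt passes the filter, then filter the reply too."""
    if not moderate(user_message):
        return "Request declined by the input filter."
    reply = generate_fn(user_message)  # e.g. a function wrapping the generation snippet above
    if not moderate(reply):
        return "Response withheld by the output filter."
    return reply
```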