Qwen2.5-Dyanka-7B-Preview
| Property | Value |
|---|---|
| Base Model | Qwen2.5-7B |
| Parameter Count | 7B |
| Merge Method | TIES |
| Model Type | Merged LLM |
| HuggingFace | Link |
What is Qwen2.5-Dyanka-7B-Preview?
Qwen2.5-Dyanka-7B-Preview is a merged language model that combines the strengths of multiple Qwen2.5-based models using the TIES merge method. Built on gz987/qwen2.5-7b-cabs-v0.3 as its foundation, it integrates capabilities from five contributor models, each merged with the same weight and density.
Implementation Details
The model uses a TIES merge configuration in which each contributing model is assigned a weight of 0.2 and a density of 0.2. The merge is performed in bfloat16 precision with int8 masking enabled to keep memory usage in check. This setup aims to retain the strengths of each constituent model while keeping the merge computationally efficient.
- Base model: gz987/qwen2.5-7b-cabs-v0.3
- Merged with five specialized models including Clarus-7B, THREADRIPPER-Small, and Rombos-LLM
- Optimized using bfloat16 dtype and int8 masking (a configuration sketch follows below)
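The contributor repositories are only partially named above, so the model ids in the sketch below are placeholders; the merge method, base model, weight, density, dtype, and int8-mask settings mirror the description. This assumes the merge was expressed as a mergekit-style configuration, the usual tooling for TIES merges.

```python
# Hypothetical sketch of the described TIES merge as a mergekit-style config.
# Contributor repo ids are placeholders; only the base model, merge method, and
# the 0.2 weight/density, bfloat16, and int8-mask settings come from the text above.
import yaml

contributors = [
    "org/clarus-7b-variant",    # placeholder id (a Clarus-7B model)
    "org/threadripper-small",   # placeholder id (a THREADRIPPER-Small model)
    "org/rombos-llm-variant",   # placeholder id (a Rombos-LLM model)
    "org/contributor-model-4",  # placeholder id
    "org/contributor-model-5",  # placeholder id
]

config = {
    "merge_method": "ties",
    "base_model": "gz987/qwen2.5-7b-cabs-v0.3",
    "dtype": "bfloat16",
    "parameters": {"int8_mask": True},
    "models": [
        {"model": m, "parameters": {"weight": 0.2, "density": 0.2}}
        for m in contributors
    ],
}

# Write the config; the merge itself would then be run with, e.g.:
#   mergekit-yaml ties-merge.yaml ./Qwen2.5-Dyanka-7B-Preview
with open("ties-merge.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
```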
Core Capabilities
- Strong instruction following, scoring 76.40% on IFEval
- Competitive mathematical reasoning, at 48.79% on MATH Level 5
- Balanced capabilities across tasks, with a 37.30% average over the reported benchmarks
- Solid multi-step reasoning, at 36.62% on BBH
Frequently Asked Questions
Q: What makes this model unique?
This model's uniqueness lies in its carefully orchestrated merge of five specialized models onto a shared base (six models in total), each contributor receiving the same 0.2 weight and density. The TIES merge method resolves conflicting parameter updates between contributors so that their capabilities combine coherently; a simplified sketch of the procedure is shown below.
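For intuition, here is a minimal NumPy sketch of the TIES procedure on a single parameter tensor: trim low-magnitude updates, elect a sign per parameter, then average the updates that agree with that sign. It uses the 0.2 weight and density values described above and is purely illustrative, not the code used to build the released model.

```python
# Simplified, illustrative TIES merge for one parameter tensor (NumPy).
# Assumes five contributors, each with weight 0.2 and density 0.2.
import numpy as np

def ties_merge(base, finetuned, weights, density=0.2):
    """Merge fine-tuned tensors into `base` via trim / elect-sign / disjoint-mean."""
    # 1. Task vectors: what each fine-tune changed relative to the base.
    taus = [ft - base for ft in finetuned]

    # 2. Trim: keep only the top-`density` fraction of entries by magnitude.
    trimmed = []
    for tau in taus:
        k = max(1, int(density * tau.size))
        threshold = np.sort(np.abs(tau).ravel())[-k]
        trimmed.append(np.where(np.abs(tau) >= threshold, tau, 0.0))

    # 3. Elect a per-parameter sign from the weighted sum of trimmed updates.
    elected_sign = np.sign(sum(w * t for w, t in zip(weights, trimmed)))

    # 4. Disjoint mean: average only the updates that agree with the elected sign.
    agree_sum = sum(np.where(np.sign(t) == elected_sign, w * t, 0.0)
                    for w, t in zip(weights, trimmed))
    agree_weight = sum(np.where(np.sign(t) == elected_sign, w, 0.0)
                       for w, t in zip(weights, trimmed))
    merged_tau = agree_sum / np.clip(agree_weight, 1e-8, None)

    # 5. Apply the merged task vector to the base weights.
    return base + merged_tau

# Toy usage with random tensors standing in for real model weights.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 4))
finetuned = [base + rng.normal(scale=0.1, size=(4, 4)) for _ in range(5)]
merged = ties_merge(base, finetuned, weights=[0.2] * 5, density=0.2)
```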
Q: What are the recommended use cases?
Based on its benchmark performance, this model is particularly well-suited to instruction-following tasks (IFEval), mathematical reasoning, and general language understanding. It performs well in both zero-shot and few-shot scenarios. A minimal inference example is sketched below.
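A minimal usage sketch with Hugging Face transformers, assuming a placeholder repository id; substitute the actual HuggingFace repo linked above.

```python
# Minimal inference sketch; the repo id below is a placeholder, not the official one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/Qwen2.5-Dyanka-7B-Preview"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches the bfloat16 dtype used for the merge
    device_map="auto",
)

# Qwen2.5-based models ship a chat template, so apply_chat_template builds the prompt.
messages = [{"role": "user", "content": "A train covers 120 km in 1.5 hours. What is its average speed?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```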