ZYH-LLM-Qwen2.5-14B-V3
| Property | Value |
|---|---|
| Parameter Count | 14 Billion |
| Model Type | Large Language Model |
| Base Architecture | Qwen2.5 |
| Hugging Face URL | https://huggingface.co/YOYO-AI/ZYH-LLM-Qwen2.5-14B-V3 |
| IFEval Score | 85.78 (0-shot) |
What is ZYH-LLM-Qwen2.5-14B-V3?
ZYH-LLM-Qwen2.5-14B-V3 is the third generation of the ZYH-LLM series, built with a multi-stage model-merging pipeline. As of February 2025, it held the highest IFEval score among 14B-parameter models. The model combines several merging techniques (della, sce, and model_stock) to produce a single unified model based on the Qwen2.5 architecture.
Implementation Details
The model's development follows a three-stage merging process: the first stage performs multiple della-method merges of Qwen2.5-14B variants with EVA-Qwen2.5-14B models; the second stage incorporates merges with Virtuoso-Small-v2 and Blossom-V6-14B; the final stage uses the model_stock method to combine all previous results. Key configuration details:
- Uses bfloat16 precision throughout the merging process
- Enables the normalize and int8_mask options in the merge configurations
- Employs multiple merge methods, including della, sce, and model_stock (a hedged configuration sketch follows this list)
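To make the configuration details above concrete, here is a minimal sketch of what one della-stage merge could look like through mergekit's Python API. This is not the author's published recipe: the base model, the model list, and the weight/density values are illustrative assumptions; only the della method, the bfloat16 dtype, and the normalize and int8_mask options come from the description above.

```python
# Hedged sketch of a single della-stage merge with mergekit.
# Model repositories, weights, and densities below are assumptions;
# consult the mergekit config on the model card for the real recipe.
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

merge_config = MergeConfiguration.model_validate({
    "merge_method": "della",              # stage-1 method per the text above
    "base_model": "Qwen/Qwen2.5-14B",     # assumed base model
    "models": [
        {
            "model": "Qwen/Qwen2.5-14B-Instruct",  # one Qwen2.5-14B variant
            "parameters": {"weight": 1.0, "density": 0.5},
        },
        {
            # Placeholder id for an EVA-Qwen2.5-14B checkpoint; see model card.
            "model": "EVA-Qwen2.5-14B",
            "parameters": {"weight": 1.0, "density": 0.5},
        },
    ],
    "dtype": "bfloat16",                  # bfloat16 throughout, per the notes
    "parameters": {"normalize": True, "int8_mask": True},
})

run_merge(
    merge_config,
    out_path="./stage1-della-merge",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),   # merge on GPU when available
        copy_tokenizer=True,              # carry the tokenizer into the output
    ),
)
```

The later stages would repeat this pattern with different methods (sce, then model_stock), feeding each stage's output directory in as an input model for the next.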
Core Capabilities
- IFEval (0-shot): 85.78
- BBH (3-shot): 48.18
- MATH Lvl 5 (4-shot): 52.72
- GPQA (0-shot): 10.96
- MuSR (0-shot): 9.00
- MMLU-PRO (5-shot): 43.12
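These six benchmarks are the ones reported by the Hugging Face Open LLM Leaderboard, which summarizes them as an equal-weight average. The short snippet below reproduces that average from the scores above; note the equal weighting is the leaderboard's convention, not something stated in this section.

```python
# Equal-weight average of the six leaderboard benchmarks listed above.
scores = {
    "IFEval": 85.78,
    "BBH": 48.18,
    "MATH Lvl 5": 52.72,
    "GPQA": 10.96,
    "MuSR": 9.00,
    "MMLU-PRO": 43.12,
}
average = sum(scores.values()) / len(scores)
print(f"Leaderboard average: {average:.2f}")  # -> 41.63
```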
Frequently Asked Questions
Q: What makes this model unique?
A: Its distinctive feature is the multi-stage merging approach, which combines the strengths of several Qwen2.5 variants and other models to achieve state-of-the-art IFEval performance among 14B models.
Q: What are the recommended use cases?
A: Given its balanced benchmark results, the model is well suited to general language tasks, particularly those requiring strong zero-shot and few-shot capabilities. Its high IFEval score indicates particular strength in instruction following. A minimal usage sketch follows.
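The following loading-and-generation sketch uses the Hugging Face transformers library. It assumes the repository bundles the standard Qwen2.5 chat template and loads the weights in bfloat16, matching the precision used during merging.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "YOYO-AI/ZYH-LLM-Qwen2.5-14B-V3"

# Load tokenizer and model; bfloat16 matches the merge-time precision.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Qwen2.5-style chat prompt (assumes the standard chat template is bundled).
messages = [{"role": "user", "content": "List three uses of model merging."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```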