ZYH-LLM-Qwen2.5-14B-V3
| Property | Value |
|---|---|
| Parameter Count | 14 Billion |
| Model Type | Large Language Model |
| Base Architecture | Qwen2.5 |
| Hugging Face URL | https://huggingface.co/YOYO-AI/ZYH-LLM-Qwen2.5-14B-V3 |
| IFEval Score | 85.78 (0-shot) |
What is ZYH-LLM-Qwen2.5-14B-V3?
ZYH-LLM-Qwen2.5-14B-V3 is the third generation of the ZYH-LLM series, built with a multi-stage model-merging pipeline. As of February 2025, it held the highest IFEval score among 14B-parameter models. The model combines several merging techniques (della, sce, and model_stock) to produce a single unified model based on the Qwen2.5 architecture.
Implementation Details
The model's development follows a three-stage merging process: the first stage performs multiple della-method merges of Qwen2.5-14B variants with EVA-Qwen2.5-14B models; the second stage incorporates merges with Virtuoso-Small-v2 and Blossom-V6-14B; the final stage uses the model_stock method to combine all previous results. Key configuration details:
- Uses bfloat16 precision throughout the merging process
- Enables the normalize and int8_mask options in the merge configurations
- Employs multiple merge methods, including della, sce, and model_stock (a hedged configuration sketch follows this list)
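To make the configuration details above concrete, here is a minimal sketch of what one della-stage merge could look like through mergekit's Python API. This is not the author's published recipe: the base model, the model list, and the weight/density values are illustrative assumptions; only the della method, the bfloat16 dtype, and the normalize and int8_mask options come from the description above.

```python
# Hedged sketch of a single della-stage merge with mergekit.
# Model repositories, weights, and densities below are assumptions;
# consult the mergekit config on the model card for the real recipe.
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

merge_config = MergeConfiguration.model_validate({
    "merge_method": "della",              # stage-1 method per the text above
    "base_model": "Qwen/Qwen2.5-14B",     # assumed base model
    "models": [
        {
            "model": "Qwen/Qwen2.5-14B-Instruct",  # one Qwen2.5-14B variant
            "parameters": {"weight": 1.0, "density": 0.5},
        },
        {
            # Placeholder id for an EVA-Qwen2.5-14B checkpoint; see model card.
            "model": "EVA-Qwen2.5-14B",
            "parameters": {"weight": 1.0, "density": 0.5},
        },
    ],
    "dtype": "bfloat16",                  # bfloat16 throughout, per the notes
    "parameters": {"normalize": True, "int8_mask": True},
})

run_merge(
    merge_config,
    out_path="./stage1-della-merge",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),   # merge on GPU when available
        copy_tokenizer=True,              # carry the tokenizer into the output
    ),
)
```

The later stages would repeat this pattern with different methods (sce, then model_stock), feeding each stage's output directory in as an input model for the next.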
Core Capabilities
- IFEval (0-shot): 85.78
- BBH (3-shot): 48.18
- MATH Lvl 5 (4-shot): 52.72
- GPQA (0-shot): 10.96
- MuSR (0-shot): 9.00
- MMLU-PRO (5-shot): 43.12
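These six benchmarks are the ones reported by the Hugging Face Open LLM Leaderboard, which summarizes them as an equal-weight average. The short snippet below reproduces that average from the scores above; note the equal weighting is the leaderboard's convention, not something stated in this section.

```python
# Equal-weight average of the six leaderboard benchmarks listed above.
scores = {
    "IFEval": 85.78,
    "BBH": 48.18,
    "MATH Lvl 5": 52.72,
    "GPQA": 10.96,
    "MuSR": 9.00,
    "MMLU-PRO": 43.12,
}
average = sum(scores.values()) / len(scores)
print(f"Leaderboard average: {average:.2f}")  # -> 41.63
```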
Frequently Asked Questions
Q: What makes this model unique?
A: Its distinctive feature is the multi-stage merging approach, which combines the strengths of several Qwen2.5 variants and other models to achieve state-of-the-art IFEval performance among 14B models.
Q: What are the recommended use cases?
A: Given its balanced benchmark results, the model is well suited to general language tasks, particularly those requiring strong zero-shot and few-shot capabilities. Its high IFEval score indicates particular strength in instruction following. A minimal usage sketch follows.
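The following loading-and-generation sketch uses the Hugging Face transformers library. It assumes the repository bundles the standard Qwen2.5 chat template and loads the weights in bfloat16, matching the precision used during merging.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "YOYO-AI/ZYH-LLM-Qwen2.5-14B-V3"

# Load tokenizer and model; bfloat16 matches the merge-time precision.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Qwen2.5-style chat prompt (assumes the standard chat template is bundled).
messages = [{"role": "user", "content": "List three uses of model merging."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```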