Qwen2.5-7B-Instruct-Merge-Stock-v0.1
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Parameters | 7 Billion |
| Model Type | Instruction-tuned Language Model |
| Hugging Face | Link |
What is Qwen2.5-7B-Instruct-Merge-Stock-v0.1?
This model is a merge of multiple Qwen2.5-based models produced with the Model Stock merge method. It combines models including Qwen-2.5-7B-R1-Stock, Qwen2.5-Dyanka-7B-Preview, Clarus-7B-v0.3, and QandoraExp-7B into a single instruction-tuned language model with enhanced reasoning capabilities.
Implementation Details
The model uses a merge configuration in bfloat16 precision and inherits its tokenizer from Qwen/Qwen2.5-7B-Instruct. A distinctive feature is its thinking mode, in which the model wraps its reasoning in `<think>` tags before giving the final answer.
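The card does not include a parsing snippet; as a minimal sketch, assuming the model emits its reasoning inside literal `<think>...</think>` tags followed by the answer, the two parts can be separated like this (the function name and return convention are illustrative, not part of the model's API):

```python
import re

def split_thinking(response: str) -> tuple[str, str]:
    """Split a thinking-mode response into (reasoning, answer).

    Assumes reasoning appears in a single <think>...</think> block
    before the answer; returns "" for reasoning if no block is found.
    """
    match = re.search(r"<think>(.*?)</think>", response, re.DOTALL)
    if not match:
        return "", response.strip()
    reasoning = match.group(1).strip()
    answer = response[match.end():].strip()
    return reasoning, answer

# Example:
# split_thinking("<think>2 + 2 = 4</think>The answer is 4.")
# -> ("2 + 2 = 4", "The answer is 4.")
```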
- Merged using Model Stock methodology
- Implements specialized thinking mode for structured reasoning
- Combines five different model variants with optimized weights
- Uses bfloat16 data type for efficient processing
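Model Stock derives an interpolation ratio from the angle between each fine-tuned checkpoint and the base model, then pulls the average of the fine-tuned weights back toward the base. As a toy per-tensor illustration of that rule (a sketch with NumPy, not the actual mergekit implementation used for this model):

```python
import numpy as np

def model_stock_merge(w_base, w_finetuned):
    """Merge fine-tuned weight tensors toward the base weights.

    Implements the Model Stock rule t = k*cos / (1 + (k-1)*cos),
    where cos is the mean pairwise cosine similarity between the
    fine-tuned deltas (w_i - w_base) and k is the number of models.
    Assumes at least two fine-tuned tensors are given.
    """
    deltas = [w - w_base for w in w_finetuned]
    k = len(deltas)
    # Mean pairwise cosine similarity between the deltas.
    cos_vals = []
    for i in range(k):
        for j in range(i + 1, k):
            a, b = deltas[i].ravel(), deltas[j].ravel()
            cos_vals.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    cos_theta = float(np.mean(cos_vals))
    t = k * cos_theta / (1 + (k - 1) * cos_theta)
    # Interpolate between the base and the fine-tuned average.
    w_avg = np.mean(w_finetuned, axis=0)
    return t * w_avg + (1 - t) * w_base
```

In practice the merge is configured declaratively (e.g. via mergekit's `model_stock` method) rather than hand-rolled, and the ratio is computed per layer across all merged checkpoints.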
Core Capabilities
- Strong performance in IFEval with 75.09% accuracy (0-shot)
- Capable mathematical reasoning with 48.94% on MATH Lvl 5 (4-shot)
- Balanced performance across multiple benchmarks with 36.14% average score
- Enhanced reasoning capabilities through structured thinking approach
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its merge methodology combining multiple specialized variants of Qwen2.5, along with its structured thinking mode that explicitly separates reasoning from responses.
Q: What are the recommended use cases?
This model is particularly well-suited for tasks requiring careful reasoning, instruction following, and mathematical problem-solving, as evidenced by its strong performance on IFEval and MATH benchmarks.