# FuseO1-QwQ-DeepSeekR1-LightR1-32B
| Property | Value |
|---|---|
| Model Size | 32B parameters |
| Base Models | Qwen/QwQ-32B, DeepSeek-R1-Distill-Qwen-32B, Light-R1-32B |
| Hugging Face | Model Repository |
## What is FuseO1-QwQ-DeepSeekR1-LightR1-32B?
FuseO1-QwQ-DeepSeekR1-LightR1-32B is a 32B-parameter language model that combines the strengths of three reasoning-focused base models via FuseAI's SCE merging method. It performs strongly on mathematical reasoning tasks, improving on the best individual base model on the AIME24 benchmark by +1.4 Pass@1 and +3.3 Pass@16.
## Implementation Details
The model merges knowledge from QwQ-32B, DeepSeek-R1-Distill-Qwen-32B, and Light-R1-32B into a single set of weights. It performs consistently across a range of sampling parameters, with the best results at temperature=0.6 and top-p=0.95.
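The recommended sampling settings can be applied with a standard `transformers` generation call. The sketch below is illustrative: the repository id and token budget are assumptions, not values taken from this card.

```python
def solve(prompt: str) -> str:
    """Generate a solution with the card's recommended sampling settings."""
    # Lazy import so the sampling constants can be inspected without
    # pulling in the (heavy) transformers dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, **SAMPLING_KWARGS)
    # Decode only the newly generated tokens.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)


MODEL_ID = "FuseAI/FuseO1-QwQ-DeepSeekR1-LightR1-32B"  # assumed repo id

# Sampling settings recommended above; max_new_tokens is an assumption
# (long reasoning chains need a generous budget).
SAMPLING_KWARGS = {
    "do_sample": True,
    "temperature": 0.6,
    "top_p": 0.95,
    "max_new_tokens": 4096,
}
```

A greedy-decoding run (`do_sample=False`) is also possible, but the figures below were reported with the sampled configuration.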
- Achieves 77.9% Pass@1 on the AIME24 benchmark
- Reaches an 86.7% consistency rate (Cons@16)
- Improves measurably on the base models' mathematical reasoning
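The Pass@k and Cons@k metrics quoted above can be computed with the standard estimators. The sketch below uses invented sample counts purely for illustration; only the formulas reflect common usage.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: probability that at least one of k
    samples drawn from n total attempts (c of them correct) is correct."""
    if n - c < k:
        return 1.0  # fewer than k wrong samples: some draw must be correct
    return 1.0 - comb(n - c, k) / comb(n, k)


def cons_at_k(answers: list[str], reference: str) -> bool:
    """Cons@k: does the majority-voted answer over k samples match the
    reference solution?"""
    majority = max(set(answers), key=answers.count)
    return majority == reference


# Hypothetical problem: 16 samples drawn, 12 reach the correct answer.
print(pass_at_k(16, 12, 1))              # 0.75 (= 12/16)

samples = ["42"] * 12 + ["7"] * 4        # invented sampled answers
print(cons_at_k(samples, "42"))          # True: "42" wins the majority vote
```

Cons@k rewards a model whose samples agree with each other, which is why it can exceed Pass@1 when errors are scattered across many distinct wrong answers.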
## Core Capabilities
- Enhanced mathematical problem-solving abilities
- Strong performance on AIME benchmarks
- Consistent reasoning across complex problems
- Flexible sampling parameter optimization
## Frequently Asked Questions
Q: What makes this model unique?
Its strength lies in a merged architecture that combines three strong base models, yielding better performance on mathematical reasoning tasks than any single parent while keeping outputs highly consistent.
Q: What are the recommended use cases?
The model is particularly well-suited for mathematical problem-solving, especially in competitive mathematics scenarios like AIME problems. It shows strong capabilities in step-by-step reasoning and solution generation.