TinyR1-32B-Preview
| Property | Value |
|---|---|
| Model Size | 32B parameters |
| Author | qihoo360 |
| Release Date | March 2025 |
| Model Hub | Hugging Face |
What is TinyR1-32B-Preview?
TinyR1-32B-Preview is a groundbreaking reasoning-focused language model that achieves performance comparable to much larger models while using significantly fewer parameters. Built through supervised fine-tuning of DeepSeek-R1-Distill-Qwen-32B, it specializes in mathematics, coding, and science tasks, outperforming 70B-parameter distilled models on several benchmarks.
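Getting started follows the standard Hugging Face workflow. Below is a minimal inference sketch; the repository id `qihoo360/TinyR1-32B-Preview` and the sampling settings are assumptions to verify against the model card.

```python
# Minimal inference sketch. The repo id and generation settings below are
# assumptions; check the model card for the recommended configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qihoo360/TinyR1-32B-Preview"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 32B weights need roughly 64 GB in bf16
    device_map="auto",           # shard across available GPUs
)

messages = [
    {"role": "user",
     "content": "Solve step by step: what is the sum of the first 100 positive integers?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Reasoning models of this family tend to emit long chains of thought, so a generous `max_new_tokens` budget is advisable.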
Implementation Details
The model was developed using the 360-LLaMA-Factory training framework and leverages the Mergekit tool to combine domain-specific expert models. It posts strong benchmark scores: 78.1% on AIME 2024, 61.6% on LiveCodeBench, and 65.0% on GPQA-Diamond.
- Trained on 58.3k mathematics trajectories
- Incorporates 19k coding examples
- Uses 8.6k science-focused training samples
- Merges the domain-specific models into a single checkpoint (a simplified sketch follows this list)
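Mergekit implements several merge algorithms, and the exact recipe used for TinyR1 is not spelled out here. The toy sketch below only illustrates the general idea behind parameter-space merging, i.e. combining the weights of separately fine-tuned experts; the checkpoint paths and the plain weighted average are hypothetical simplifications, not the actual method.

```python
# Conceptual sketch of parameter-space model merging. This is NOT the exact
# recipe used for TinyR1 -- Mergekit offers more sophisticated methods -- and
# the checkpoint paths are hypothetical placeholders.
import torch

def average_merge(state_dicts, weights):
    """Weighted average of state dicts that share identical keys and shapes."""
    total = sum(weights)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(
            (w / total) * sd[key].float() for w, sd in zip(weights, state_dicts)
        )
    return merged

experts = [
    torch.load(path, map_location="cpu")  # hypothetical expert checkpoints
    for path in ("math_expert.pt", "code_expert.pt", "science_expert.pt")
]
merged = average_merge(experts, weights=[1.0, 1.0, 1.0])
torch.save(merged, "merged_model.pt")
```

Merging in parameter space keeps inference at the cost of a single 32B model while retaining much of each expert's domain skill, which is the efficiency argument behind the approach.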
Core Capabilities
- Advanced mathematical reasoning and problem-solving
- Strong coding performance across various tasks
- Scientific reasoning and analysis
- Efficient parameter utilization through model distillation
Frequently Asked Questions
Q: What makes this model unique?
TinyR1-32B-Preview achieves near-R1 performance with just 32B parameters, demonstrating exceptional efficiency through specialized domain training and innovative model merging techniques.
Q: What are the recommended use cases?
The model excels in mathematical problem-solving, coding tasks, and scientific reasoning. It's particularly well-suited for applications requiring step-by-step reasoning and complex problem-solving in these domains.
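Since TinyR1 derives from a DeepSeek-R1 distillation, its completions typically interleave a long reasoning trace with the final answer. Assuming it follows the R1 convention of wrapping that trace in `<think>...</think>` tags (worth verifying against actual output), the two can be separated like this:

```python
# Split an R1-style completion into reasoning trace and final answer.
# Assumes the <think>...</think> convention of DeepSeek-R1 distills;
# verify the tag format against the model's actual output.
import re

def split_reasoning(completion: str):
    match = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if match is None:
        return None, completion.strip()  # no trace found; treat it all as answer
    return match.group(1).strip(), completion[match.end():].strip()

text = "<think>1 + 2 + ... + 100 = 100 * 101 / 2 = 5050</think>The sum is 5050."
trace, answer = split_reasoning(text)
print(answer)  # -> The sum is 5050.
```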