TinyR1-32B-Preview
| Property | Value |
|---|---|
| Parameter Count | 32 Billion |
| Base Model | DeepSeek-R1-Distill-Qwen-32B |
| Model Hub | Hugging Face |
| Training Framework | 360-LLaMA-Factory |
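Since the model is distributed on the Hugging Face Hub, a minimal loading sketch with the `transformers` library is shown below. The repository id, dtype, and device settings are assumptions, not taken from the card, so check the model's actual Hub page before running.

```python
# Minimal loading sketch with Hugging Face transformers.
# The repo id "qihoo360/TinyR1-32B-Preview" is an assumption; verify it on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qihoo360/TinyR1-32B-Preview"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick bf16/fp16 automatically where supported
    device_map="auto",    # requires `accelerate`; a 32B model needs ~64 GB in bf16
)
```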
What is TinyR1-32B-Preview?
TinyR1-32B-Preview is a language model that achieves near-R1 performance with only about 5% of the parameters of the full DeepSeek-R1. It represents a significant advance in model efficiency, outperforming the larger DeepSeek-R1-Distill-Llama-70B on mathematical reasoning tasks.
Implementation Details
The model was developed using a SuperDistillation approach that combines three domain-specific models, each trained through supervised fine-tuning. Training used the 360-LLaMA-Factory framework, and the domain models were combined with Mergekit (a conceptual merging sketch follows the results below). The training data included 58.3k math CoT trajectories, 19k coding trajectories, and 60.8k science CoT trajectories. Reported benchmark results:
- Mathematics performance: 78.1% on AIME 2024
- Coding performance: 61.6% on LiveCodeBench
- Science performance: 65.0% on GPQA-Diamond
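The card does not give the exact Mergekit configuration, so the Python sketch below only illustrates the underlying idea of combining several domain-specific checkpoints into one model by uniform weight averaging. The checkpoint paths are hypothetical and this is not the recipe actually used for TinyR1.

```python
# Conceptual sketch only: combine fine-tuned checkpoints by uniform weight
# averaging. NOT the actual Mergekit recipe; a real 32B merge also needs far
# more memory than this naive in-RAM loop.
import torch
from transformers import AutoModelForCausalLM

# Hypothetical local paths to the three domain-specific SFT checkpoints.
checkpoints = ["math-sft", "code-sft", "science-sft"]
models = [AutoModelForCausalLM.from_pretrained(path) for path in checkpoints]

# Average every tensor across the three state dicts.
state_dicts = [m.state_dict() for m in models]
merged_state = {
    name: torch.stack([sd[name].float() for sd in state_dicts]).mean(dim=0)
    for name in state_dicts[0]
}

# Write the averaged weights back into one model and save it.
models[0].load_state_dict(merged_state)
models[0].save_pretrained("tinyr1-merged-sketch")
```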
Core Capabilities
- Advanced mathematical reasoning comparable to the full R1 model
- Strong coding abilities demonstrated through LiveCodeBench performance
- Robust scientific reasoning capabilities
- Efficient parameter utilization through SuperDistillation
Frequently Asked Questions
Q: What makes this model unique?
A: This model achieves remarkable performance using only 32B parameters, nearly matching the capabilities of larger models while being more efficient. Its SuperDistillation approach enables strong performance across mathematics, coding, and science domains.
Q: What are the recommended use cases?
A: The model excels in mathematical problem-solving, coding tasks, and scientific reasoning. It's particularly well-suited for applications requiring strong analytical capabilities while maintaining computational efficiency.
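As a usage illustration for these analytical tasks, the sketch below prompts the model with a math question. It assumes the `model` and `tokenizer` objects from the loading example above; the chat-template usage and sampling settings are assumptions, not official recommendations.

```python
# Usage sketch: ask the loaded model a math question.
messages = [
    {"role": "user", "content": "What is the sum of the first 50 positive odd integers?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling settings are illustrative, not tuned recommendations.
output = model.generate(inputs, max_new_tokens=2048, do_sample=True, temperature=0.6)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```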