TinyR1-32B-Preview

Maintained By
qihoo360

  • Model Size: 32B parameters
  • Author: qihoo360
  • Release Date: March 2025
  • Model Hub: Hugging Face

What is TinyR1-32B-Preview?

TinyR1-32B-Preview is a reasoning-focused language model that matches the performance of much larger models while using significantly fewer parameters. Built through supervised fine-tuning of DeepSeek-R1-Distill-Qwen-32B, it specializes in mathematics, coding, and science tasks, and outperforms 70B-parameter models on several benchmarks.

Implementation Details

The model was developed using the 360-LLaMA-Factory training framework and leverages the Mergekit tool to combine domain-specific models. It demonstrates impressive benchmark scores, achieving 78.1% on AIME 2024, 61.6% on LiveCodeBench, and 65.0% on GPQA-Diamond.

  • Trained on 58.3k mathematics trajectories
  • Incorporates 19k coding examples
  • Uses 8.6k science-focused training samples
  • Implements specialized domain merging technique
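
The domain-merging step described above could be expressed as a Mergekit config along the following lines. This is an illustrative sketch only: the merge method, checkpoint paths, and parameter values are assumptions, since the card does not reproduce the exact recipe.

```yaml
# Hypothetical Mergekit recipe: combine three domain-specific SFT checkpoints
# (math, code, science) into one model. Method and paths are assumptions.
merge_method: ties            # assumed; Mergekit supports several merge methods
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
models:
  - model: ./sft-math         # checkpoint tuned on the 58.3k math trajectories
    parameters:
      weight: 1.0
  - model: ./sft-code         # checkpoint tuned on the 19k coding examples
    parameters:
      weight: 1.0
  - model: ./sft-science      # checkpoint tuned on the 8.6k science samples
    parameters:
      weight: 1.0
dtype: bfloat16
```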

Core Capabilities

  • Advanced mathematical reasoning and problem-solving
  • Strong coding performance across various tasks
  • Scientific reasoning and analysis
  • Efficient parameter utilization through model distillation

Frequently Asked Questions

Q: What makes this model unique?

TinyR1-32B-Preview achieves near-R1 performance with just 32B parameters, demonstrating exceptional efficiency through specialized domain training and innovative model merging techniques.

Q: What are the recommended use cases?

The model excels in mathematical problem-solving, coding tasks, and scientific reasoning. It's particularly well-suited for applications requiring step-by-step reasoning and complex problem-solving in these domains.
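
For step-by-step reasoning tasks like these, the model can be driven with the standard Hugging Face transformers API. The sketch below assumes the checkpoint is published under the repo id "qihoo360/TinyR1-32B-Preview" (the exact id is an assumption) and that transformers and a suitable torch build are installed; treat it as a starting point, not an official usage guide.

```python
# Minimal inference sketch for TinyR1-32B-Preview (repo id is an assumption).

MODEL_ID = "qihoo360/TinyR1-32B-Preview"


def build_prompt(question: str) -> list[dict]:
    """Wrap a question in the single-turn chat format used by R1-style models."""
    return [{"role": "user", "content": question}]


def generate_answer(question: str, max_new_tokens: int = 2048) -> str:
    """Load the model and generate an answer for one question.

    The heavy imports live inside the function so the prompt helper can be
    read and tested without downloading the 32B checkpoint.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_prompt(question), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_answer("What is the sum of the first 100 positive integers?"))
```

Note that R1-style reasoning models emit a long chain of thought before the final answer, so a generous `max_new_tokens` budget is usually needed.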
