TinyR1-32B-Preview

Maintained By
qihoo360

  • Model Size: 32B parameters
  • Author: qihoo360
  • Release Date: March 2025
  • Model Hub: Hugging Face

What is TinyR1-32B-Preview?

TinyR1-32B-Preview is a reasoning-focused language model that matches the performance of much larger models while using significantly fewer parameters. Built through supervised fine-tuning of DeepSeek-R1-Distill-Qwen-32B, it specializes in mathematics, coding, and science tasks, and outperforms 70B-parameter models on several benchmarks.

Implementation Details

The model was developed using the 360-LLaMA-Factory training framework and leverages the Mergekit tool to combine domain-specific models. It demonstrates impressive benchmark scores, achieving 78.1% on AIME 2024, 61.6% on LiveCodeBench, and 65.0% on GPQA-Diamond.

  • Trained on 58.3k mathematics trajectories
  • Incorporates 19k coding examples
  • Uses 8.6k science-focused training samples
  • Implements specialized domain merging technique
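
The domain-merging step described above could be expressed as a Mergekit config along the following lines. This is an illustrative sketch only: the merge method, checkpoint paths, and parameter values are assumptions, since the card does not reproduce the exact recipe.

```yaml
# Hypothetical Mergekit recipe: combine three domain-specific SFT checkpoints
# (math, code, science) into one model. Method and paths are assumptions.
merge_method: ties            # assumed; Mergekit supports several merge methods
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
models:
  - model: ./sft-math         # checkpoint tuned on the 58.3k math trajectories
    parameters:
      weight: 1.0
  - model: ./sft-code         # checkpoint tuned on the 19k coding examples
    parameters:
      weight: 1.0
  - model: ./sft-science      # checkpoint tuned on the 8.6k science samples
    parameters:
      weight: 1.0
dtype: bfloat16
```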

Core Capabilities

  • Advanced mathematical reasoning and problem-solving
  • Strong coding performance across various tasks
  • Scientific reasoning and analysis
  • Efficient parameter utilization through model distillation

Frequently Asked Questions

Q: What makes this model unique?

TinyR1-32B-Preview achieves near-R1 performance with just 32B parameters, demonstrating exceptional efficiency through specialized domain training and innovative model merging techniques.

Q: What are the recommended use cases?

The model excels in mathematical problem-solving, coding tasks, and scientific reasoning. It's particularly well-suited for applications requiring step-by-step reasoning and complex problem-solving in these domains.
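
For step-by-step reasoning tasks like these, the model can be driven with the standard Hugging Face transformers API. The sketch below assumes the checkpoint is published under the repo id "qihoo360/TinyR1-32B-Preview" (the exact id is an assumption) and that transformers and a suitable torch build are installed; treat it as a starting point, not an official usage guide.

```python
# Minimal inference sketch for TinyR1-32B-Preview (repo id is an assumption).

MODEL_ID = "qihoo360/TinyR1-32B-Preview"


def build_prompt(question: str) -> list[dict]:
    """Wrap a question in the single-turn chat format used by R1-style models."""
    return [{"role": "user", "content": question}]


def generate_answer(question: str, max_new_tokens: int = 2048) -> str:
    """Load the model and generate an answer for one question.

    The heavy imports live inside the function so the prompt helper can be
    read and tested without downloading the 32B checkpoint.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_prompt(question), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_answer("What is the sum of the first 100 positive integers?"))
```

Note that R1-style reasoning models emit a long chain of thought before the final answer, so a generous `max_new_tokens` budget is usually needed.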
