Light-R1-32B

Light-R1-32B

qihoo360

A powerful 32B parameter math-focused model achieving SOTA AIME24 scores (76.6%). Trained via curriculum SFT & DPO for only $1000, surpassing DeepSeek-R1.

PropertyValue
Base ModelQwen2.5-32B-Instruct
LicenseApache 2.0
Training Cost~$1000 (6 hours on 12 x H800)
AIME24 Score76.6 (64-run average)

What is Light-R1-32B?

Light-R1-32B is a groundbreaking mathematical reasoning model that achieves state-of-the-art performance on challenging mathematics competitions like AIME. Built on Qwen2.5-32B-Instruct, it demonstrates superior performance through an innovative curriculum learning approach combining Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO).

Implementation Details

The model employs a three-stage training process: curriculum SFT stage1 with 76k data points, SFT stage2 with 3k more difficult problems, and finally DPO training. The training data is carefully curated from various public math datasets and decontaminated against common benchmarks.

  • Utilizes curriculum learning with progressive difficulty levels
  • Implements forced thinking through special tokens (<think>)
  • Leverages model merging for optimal performance
  • Trained on decontaminated mathematical datasets

Core Capabilities

  • 76.6% accuracy on AIME24 (averaged over 64 runs)
  • 64.6% accuracy on AIME25
  • 61.8% score on GPQA Diamond
  • Strong mathematical reasoning and step-by-step problem solving

Frequently Asked Questions

Q: What makes this model unique?

Light-R1-32B achieves superior performance on mathematical reasoning tasks while being trained at a fraction of the cost (~$1000) compared to other models. It's also fully open-source with available training code and datasets.

Q: What are the recommended use cases?

The model excels in mathematical problem-solving, particularly in competition-level mathematics. It's specifically designed for scenarios requiring detailed mathematical reasoning and step-by-step solution generation.

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026