Light-R1-32B-DS

Maintained by: qihoo360

Property: Value
Base Model: DeepSeek-R1-Distill-Qwen-32B
Release Date: March 12, 2025
Paper: View Paper
AIME24 Score: 78.1
AIME25 Score: 65.9

What is Light-R1-32B-DS?

Light-R1-32B-DS is a 32B-parameter mathematical reasoning model that scores 78.1 on AIME24 and 65.9 on AIME25. It is built on DeepSeek-R1-Distill-Qwen-32B and reaches these results after training on a small dataset of just 3,000 examples.

Implementation Details

The model keeps the architecture of its base model. Its training data was checked for benchmark contamination using two techniques: exact matching (with digits excluded) and 32-gram matching.

  • Scores 78.1% on AIME24 and 65.9% on AIME25
  • Scores 68.0% on GPQA
  • Training data decontaminated against benchmarks
  • Built on the DeepSeek-R1-Distill-Qwen-32B architecture
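
The two decontamination checks named above (exact matching with digits excluded, and 32-gram matching) can be illustrated with a short Python sketch. This is not the authors' pipeline: the function names, the lowercasing, and the whitespace tokenization are all assumptions made for illustration, since the source does not say how text was normalized or tokenized.

```python
import re


def strip_digits(text: str) -> str:
    """Drop digits, lowercase, and collapse whitespace (normalization is an assumption)."""
    no_digits = re.sub(r"\d", "", text)
    return " ".join(no_digits.lower().split())


def ngrams(text: str, n: int) -> set:
    """Return the set of word-level n-grams; whitespace tokenization is an assumption."""
    tokens = text.lower().split()
    # Texts shorter than n tokens yield an empty set.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def is_contaminated(sample: str, benchmark: list, n: int = 32) -> bool:
    """Flag a training sample that matches any benchmark problem under either check."""
    sample_key = strip_digits(sample)
    sample_grams = ngrams(sample, n)
    for problem in benchmark:
        # Check 1: exact match once digits are excluded.
        if sample_key == strip_digits(problem):
            return True
        # Check 2: at least one shared n-gram (n = 32 by default).
        if sample_grams & ngrams(problem, n):
            return True
    return False
```

Under these assumptions, a training problem that differs from a benchmark problem only in its numbers is caught by the first check, and a problem that copies a long passage of a benchmark problem is caught by the second.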

Core Capabilities

  • Advanced mathematical reasoning and problem-solving
  • Strong results from a small training set (3,000 examples)
  • Strong performance on standardized math benchmarks
  • Maintains data integrity through careful decontamination

Frequently Asked Questions

Q: What makes this model unique?

The model achieves near-SOTA performance on challenging mathematical benchmarks using only 3,000 training examples, which shows that strong reasoning results do not require a large training set.

Q: What are the recommended use cases?

The model is best suited to mathematical reasoning tasks, especially competition-style problems at the level of AIME.
