Light-R1-7B-DS

Maintained By
qihoo360

Light-R1-7B-DS

PropertyValue
Authorqihoo360
Base ModelDeepSeek-R1-Distill-Qwen-7B
Model Size7B parameters
Release DateMarch 12, 2025
HuggingFaceLink

What is Light-R1-7B-DS?

Light-R1-7B-DS represents a significant breakthrough in mathematical reasoning AI models, achieving state-of-the-art performance with remarkably efficient training. Built upon DeepSeek-R1-Distill-Qwen-7B, this model has been fine-tuned with just 3,000 carefully curated training examples, demonstrating exceptional performance on challenging mathematical benchmarks.

Implementation Details

The model builds upon the DeepSeek-R1-Distill-Qwen-7B architecture and implements rigorous data decontamination practices, including exact matching (excluding digits) and N-gram (N=32) matching to ensure benchmark integrity. It maintains the same usage pattern as its base model while achieving superior performance.

  • Achieves 59.1% accuracy on AIME24 and 44.3% on AIME25
  • 49.4% performance on GPQA without specific training
  • Trained with only 3K SFT data
  • Implements thorough data decontamination protocols

Core Capabilities

  • Advanced mathematical reasoning and problem-solving
  • Strong performance on American Invitational Mathematics Examination (AIME) problems
  • Generalized problem-solving capabilities demonstrated through GPQA performance
  • Efficient learning from limited training data

Frequently Asked Questions

Q: What makes this model unique?

Light-R1-7B-DS stands out for achieving state-of-the-art performance on mathematical reasoning tasks with minimal training data (3K examples), demonstrating exceptional efficiency in learning and generalization.

Q: What are the recommended use cases?

The model is particularly suited for mathematical problem-solving, especially in competitive mathematics contexts and general mathematical reasoning tasks. It shows strong capabilities in both specialized mathematics (AIME) and general problem-solving (GPQA).

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.