Light-R1-14B-DS

Light-R1-14B-DS

qihoo360

Light-R1-14B-DS is a 14B parameter SOTA math model achieving impressive AIME scores (74.0/60.2), featuring successful RL implementation on long-COT finetuned models.

PropertyValue
Authorqihoo360
Base ModelDeepSeek-R1-Distill-Qwen-14B
Release DateMarch 12, 2025
Model URLHugging Face

What is Light-R1-14B-DS?

Light-R1-14B-DS represents a breakthrough in mathematical reasoning capabilities, being the first open-source model to successfully implement Reinforcement Learning (RL) on long-COT finetuned models under light computational budget. The model achieves state-of-the-art performance for 14B parameter models, with impressive scores of 74.0 and 60.2 on AIME 24 & 25 respectively.

Implementation Details

Built upon DeepSeek-R1-Distill-Qwen-14B, this model underwent specialized long-COT RL Post-Training. The training process demonstrated the expected behavior of simultaneous increases in response length and reward scores, marking a significant advancement in RL implementation for mathematical reasoning.

  • Careful data decontamination process using exact matching and 32-gram matching
  • Specialized training focusing on maintaining data integrity
  • Successful implementation of RL on already long-COT finetuned models

Core Capabilities

  • State-of-the-art performance on AIME mathematics benchmarks
  • Strong performance on GPQA without specific training (61.7)
  • Enhanced long-form Chain of Thought reasoning
  • Efficient performance under light computational requirements

Frequently Asked Questions

Q: What makes this model unique?

The model represents the first successful attempt at applying RL to already long-COT finetuned models in a computationally efficient manner, achieving SOTA results that outperform many 32B models.

Q: What are the recommended use cases?

The model excels in mathematical reasoning tasks, particularly in complex problem-solving scenarios requiring detailed step-by-step solutions, making it ideal for educational applications and mathematical research.

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026