Open-RS3
| Property | Value |
|---|---|
| Base Model | DeepSeek-R1-Distill-Qwen-1.5B |
| Parameter Count | 1.5B |
| Paper | arXiv:2503.16219 |
| Model URL | Hugging Face |
What is Open-RS3?
Open-RS3 is a version of the DeepSeek-R1-Distill-Qwen-1.5B language model that has been further trained with reinforcement learning to improve mathematical reasoning. It shows that strong reasoning results are reachable at a small parameter count, so effective mathematical reasoning does not always require massive models.
Implementation Details
The model was trained with reinforcement learning on 4 A40 GPUs, completing training in under 24 hours at a cost of approximately $42. Training used 7,000 samples and generated 42,000 outputs in total (six per sample), making it a low-cost alternative to large-scale training runs.
Reported results:
- 56.3% average score across the evaluated benchmarks
- 80% accuracy on AMC23
- 46.7% accuracy on AIME24, ahead of o1-preview's 44.6%
- Competitive performance on MATH-500
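The figures in this section can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope calculation over the numbers quoted above; the function and constant names are illustrative, and because training took "under 24 hours", the GPU-hour total is an upper bound and the implied hourly rate is a lower bound, not a quoted price.

```python
# Figures quoted in this section (cost and duration are approximate).
NUM_GPUS = 4
MAX_HOURS = 24            # "under 24 hours", so treat as an upper bound
TOTAL_COST_USD = 42.0
NUM_SAMPLES = 7_000
TOTAL_OUTPUTS = 42_000
AIME24_OPEN_RS3 = 46.7    # percent
AIME24_O1_PREVIEW = 44.6  # percent


def training_budget():
    """Derive per-sample and per-GPU-hour figures from the quoted totals."""
    max_gpu_hours = NUM_GPUS * MAX_HOURS
    return {
        # Generated outputs per training sample.
        "outputs_per_sample": TOTAL_OUTPUTS / NUM_SAMPLES,
        # Upper bound on GPU-hours consumed.
        "max_gpu_hours": max_gpu_hours,
        # Lower bound on the implied price per GPU-hour (assumption:
        # the whole $42 is GPU rental).
        "min_usd_per_gpu_hour": TOTAL_COST_USD / max_gpu_hours,
        # Cost per thousand generated outputs.
        "usd_per_1k_outputs": TOTAL_COST_USD * 1000 / TOTAL_OUTPUTS,
    }


def aime24_margin():
    """Gap between Open-RS3 and o1-preview on AIME24, in percentage points."""
    return round(AIME24_OPEN_RS3 - AIME24_O1_PREVIEW, 1)


if __name__ == "__main__":
    for name, value in training_budget().items():
        print(f"{name}: {value}")
    print(f"AIME24 margin (points): {aime24_margin()}")
```

Under these assumptions the run works out to six outputs per sample, at most 96 GPU-hours, and about $1 per thousand generated outputs.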
Core Capabilities
- Advanced mathematical reasoning and problem-solving
- Efficient performance on standardized mathematics tests
- Cost-effective training approach for resource-constrained environments
- Improved reasoning capabilities compared to baseline models
Frequently Asked Questions
Q: What makes this model unique?
Open-RS3 stands out for reaching strong mathematical reasoning results with only 1.5B parameters, demonstrating that effective reasoning can come from efficient RL training rather than from scaling model size alone.
Q: What are the recommended use cases?
The model is particularly well-suited for mathematical problem-solving applications, educational tools, and scenarios requiring advanced reasoning capabilities within resource-constrained environments.