ZR1-1.5B
| Property | Value |
|---|---|
| Parameter Count | 1.5B |
| Model Type | Reasoning and Coding Model |
| Author | Zyphra |
| Hugging Face | Zyphra/ZR1-1.5B |
What is ZR1-1.5B?
ZR1-1.5B is a specialized AI model focused on reasoning tasks, particularly mathematics and coding. Despite its compact 1.5B-parameter size, it delivers remarkable performance, outperforming Llama-3.1-70B-Instruct on hard coding tasks and reaching 37.91% pass@1 on GPQA-Diamond. It also represents a significant improvement over its base model, R1-Distill-1.5B, with over 50% better performance.
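Since the checkpoint is published as Zyphra/ZR1-1.5B on Hugging Face, it can be pulled with the standard transformers APIs. A minimal loading sketch, assuming the checkpoint behaves as an ordinary causal language model; the dtype and device choices are illustrative, not recommendations from the model card:

```python
# Minimal sketch: load ZR1-1.5B as a standard causal LM with transformers.
# Assumption: Zyphra/ZR1-1.5B is compatible with AutoModelForCausalLM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zyphra/ZR1-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 1.5B parameters fit easily in bf16 on a single GPU
    device_map="auto",
)
```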
Implementation Details
The model was trained using the PRIME (Process Reinforcement through IMplicit rEwards) algorithm, leveraging a dataset of approximately 400k math and 25k code samples. Training was conducted on an 8xH100 node setup, utilizing progressive context lengthening from 8k to 24k tokens.
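PRIME combines verifiable outcome rewards with implicit process rewards and, per the technique list below, pairs them with an RLOO-style leave-one-out baseline. A simplified sketch of that leave-one-out advantage calculation; the group size and reward values are illustrative, and this is not Zyphra's training code:

```python
# Simplified sketch of a leave-one-out (RLOO) advantage, as used alongside
# PRIME-style rewards. Illustrative only: group size and rewards are made up.
def rloo_advantages(rewards: list[float]) -> list[float]:
    """For each rollout, the baseline is the mean reward of the *other* rollouts."""
    k = len(rewards)
    total = sum(rewards)
    return [r - (total - r) / (k - 1) for r in rewards]

# Example: 4 rollouts for one prompt, scored 1.0 when the verifier accepts them.
print(rloo_advantages([1.0, 0.0, 0.0, 1.0]))  # [0.667, -0.667, -0.667, 0.667]
```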
- Trained on verified coding and mathematics problems using reinforcement learning
- Employs PRIME + RLOO with token-level granularity
- Uses dynamic batch sizing with accuracy filtering (sketched after this list)
- Implements iterative context lengthening for improved efficiency
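A rough sketch of the accuracy-filtering idea referenced above, under the assumption that prompts whose rollouts are nearly all correct or all incorrect are dropped because they carry little learning signal, with more prompts sampled to refill the batch; the thresholds and helper name are hypothetical, not Zyphra's implementation:

```python
# Rough sketch of accuracy filtering for online RL batches (assumed behavior).
# Prompts with all-correct or all-incorrect rollouts provide little gradient
# signal, so only prompts with mixed outcomes are kept.
def filter_by_accuracy(groups, low=0.2, high=0.8):
    """groups: list of (prompt, [0/1 verifier scores per rollout]) tuples."""
    kept = []
    for prompt, scores in groups:
        acc = sum(scores) / len(scores)
        if low <= acc <= high:  # keep only prompts with mixed outcomes
            kept.append((prompt, scores))
    return kept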
Core Capabilities
- Achieves 74% accuracy on AMPS Hard math problems
- Shows strong performance across various math benchmarks including AIME, AMC, and Olympiad
- Demonstrates 40% accuracy on LeetCode problems
- Maintains high performance with both sampling and greedy decoding (see the generation sketch after this list)
- Supports context lengths up to 32,768 tokens
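A minimal generation sketch contrasting greedy and sampled decoding, assuming the checkpoint loads as a standard causal LM; the prompt and sampling parameters are illustrative rather than recommended settings from the model card:

```python
# Minimal sketch contrasting greedy and sampled decoding with ZR1-1.5B.
# Assumption: standard causal-LM loading; sampling parameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zyphra/ZR1-1.5B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "Prove that the sum of two odd integers is even. Think step by step."
inputs = tok(prompt, return_tensors="pt").to(model.device)

greedy = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
sampled = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.6, top_p=0.95)

print(tok.decode(greedy[0], skip_special_tokens=True))
print(tok.decode(sampled[0], skip_special_tokens=True))
```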
Frequently Asked Questions
Q: What makes this model unique?
The model's ability to achieve performance comparable to much larger models while maintaining a compact 1.5B parameter size makes it unique. It demonstrates that careful training and architecture design can compensate for model size in specialized tasks.
Q: What are the recommended use cases?
ZR1-1.5B is particularly well-suited for mathematical reasoning, coding problems, and general problem-solving tasks. It excels in scenarios requiring step-by-step reasoning and can handle both short and long-form responses with high accuracy.