ZR1-1.5B

Maintained By: Zyphra

Parameter Count: 1.5B
Model Type: Reasoning and Coding Model
Author: Zyphra
Hugging Face: Zyphra/ZR1-1.5B

What is ZR1-1.5B?

ZR1-1.5B is a specialized AI model that focuses on reasoning tasks, particularly mathematics and coding. Despite its compact 1.5B-parameter size, it performs remarkably well, outperforming Llama-3.1-70B-Instruct on hard coding tasks and reaching 37.91% pass@1 accuracy on GPQA-Diamond. It also improves substantially on its base model, R1-Distill-1.5B, with over 50% better performance.
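The pass@1 figure above is typically computed with the standard unbiased pass@k estimator (average over many samples per problem, rather than a single greedy run). The sketch below shows that estimator; the function name and the sample counts in the example are illustrative, not taken from ZR1-1.5B's evaluation setup.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples drawn without replacement from n generations is correct,
    given that c of the n generations passed the checker."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 16 samples per problem, 6 pass -> estimated pass@1 = 6/16
print(pass_at_k(16, 6, 1))  # 0.375
```

With k=1 this reduces to the empirical accuracy c/n, but the same formula extends cleanly to pass@8 or pass@64 without resampling.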

Implementation Details

The model was trained with the PRIME (Process Reinforcement through IMplicit rEwards) algorithm on a dataset of approximately 400k math and 25k code samples. Training ran on a single 8xH100 node and used progressive context lengthening, growing the context window from 8k to 24k tokens over the course of training.
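A progressive context schedule like the one described can be sketched as a simple staged ramp. Only the 8k and 24k endpoints come from the text above; the number of stages and the linear spacing are assumptions for illustration, not Zyphra's exact recipe.

```python
def context_schedule(start: int = 8192, end: int = 24576, stages: int = 3) -> list[int]:
    """Linearly spaced context-length stages from `start` to `end` tokens.
    Training would run at each length in turn, so early steps stay cheap
    while later steps expose the model to longer reasoning traces."""
    step = (end - start) // (stages - 1)
    return [start + i * step for i in range(stages)]

print(context_schedule())  # [8192, 16384, 24576]
```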

  • Trained on verified coding and mathematics problems using reinforcement learning
  • Employs PRIME + RLOO with token-level granularity
  • Uses dynamic batch sizing with accuracy filtering
  • Implements iterative context lengthening for improved efficiency
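Two of the ingredients above, the RLOO baseline and accuracy filtering, are easy to illustrate in isolation. The sketch below assumes binary pass/fail rewards from a verifier; the function names and the filter thresholds are hypothetical, not values from the ZR1-1.5B training run.

```python
from statistics import mean

def rloo_advantages(rewards: list[float]) -> list[float]:
    """RLOO: each rollout's baseline is the mean reward of the *other*
    rollouts for the same prompt (leave-one-out), giving an unbiased
    advantage estimate without a learned value function."""
    n = len(rewards)
    total = sum(rewards)
    return [r - (total - r) / (n - 1) for r in rewards]

def keep_prompt(rewards: list[float], lo: float = 0.2, hi: float = 0.8) -> bool:
    """Accuracy filtering: drop prompts the policy already always solves
    (no learning signal) or never solves (pure noise). Thresholds are
    illustrative only."""
    return lo <= mean(rewards) <= hi

# Four rollouts of one verified problem, binary pass/fail rewards:
rewards = [1.0, 0.0, 0.0, 1.0]
print(rloo_advantages(rewards))  # approx [0.67, -0.67, -0.67, 0.67]
print(keep_prompt(rewards))      # True (accuracy 0.5)
```

Note that the leave-one-out advantages always sum to zero across a prompt's rollouts, which keeps the per-prompt gradient centered.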

Core Capabilities

  • Achieves 74% accuracy on the AMPS Hard benchmark
  • Shows strong performance across math benchmarks including AIME, AMC, and Olympiad
  • Demonstrates 40% accuracy on LeetCode problems
  • Maintains high performance with both sampling and greedy decoding
  • Supports context lengths up to 32,768 tokens

Frequently Asked Questions

Q: What makes this model unique?

The model's ability to achieve performance comparable to much larger models while maintaining a compact 1.5B parameter size makes it unique. It demonstrates that careful training and architecture design can compensate for model size in specialized tasks.

Q: What are the recommended use cases?

ZR1-1.5B is particularly well-suited for mathematical reasoning, coding problems, and general problem-solving tasks. It excels in scenarios requiring step-by-step reasoning and can handle both short and long-form responses with high accuracy.
