DeepSeek-R1-Distill-Qwen-32B-bnb-4bit
| Property | Value |
|---|---|
| Base Model | Qwen2.5-32B |
| Quantization | 4-bit (BitsAndBytes) |
| License | MIT |
| Context Length | 32,768 tokens |
What is DeepSeek-R1-Distill-Qwen-32B-bnb-4bit?
This is a 4-bit quantized version of DeepSeek-R1-Distill-Qwen-32B, itself a distilled version of the larger DeepSeek-R1 model. The quantization substantially reduces memory requirements while largely preserving the distilled model's performance, particularly on reasoning tasks. According to DeepSeek's reported benchmarks, the distilled model outperforms OpenAI's o1-mini on a range of tasks and achieves state-of-the-art results among dense models.
Implementation Details
The model uses BitsAndBytes (BNB) 4-bit quantization to reduce memory requirements while preserving performance. It is built on the Qwen2.5-32B architecture and was fine-tuned on 800k samples curated with DeepSeek-R1. The model retains a context length of 32,768 tokens and is optimized for deployment with vLLM (a loading sketch follows the benchmark list below). Reported benchmark results include:
- 72.6% pass@1 on AIME 2024
- 94.3% accuracy on MATH-500
- 62.1% pass@1 on GPQA Diamond
- 57.2% pass@1 on LiveCodeBench
- 1691 rating on Codeforces
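As a hedged sketch of how such a checkpoint is typically loaded, the snippet below uses the transformers library with a `BitsAndBytesConfig`. The repo id is a placeholder, and the NF4 quant type, bf16 compute dtype, and double quantization are common BNB 4-bit settings assumed for illustration, not confirmed settings of this release:

```python
# Sketch: loading a 32B model in 4-bit with BitsAndBytes via transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "DeepSeek-R1-Distill-Qwen-32B-bnb-4bit"  # placeholder repo id

# NF4 with bf16 compute and double quantization is a common BNB 4-bit setup;
# the exact settings baked into this checkpoint may differ.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs
)
```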
Core Capabilities
- Advanced mathematical reasoning and problem-solving
- Strong coding and software engineering capabilities
- Complex logical reasoning and analysis
- Long-context understanding and processing
- Efficient memory usage through 4-bit quantization
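To exercise these capabilities, a minimal generation sketch (continuing from the loading snippet above) is shown below. The prompt is illustrative, and the sampling values mirror the temperature range DeepSeek recommends for R1-series models (roughly 0.5-0.7); treat them as a starting point rather than official settings for this checkpoint:

```python
# Sketch: single-turn generation, reusing `model` and `tokenizer` from above.
messages = [
    {"role": "user", "content": "Prove that the sum of two even integers is even."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=2048,  # reasoning traces can be long
    do_sample=True,
    temperature=0.6,      # within DeepSeek's suggested 0.5-0.7 range
    top_p=0.95,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```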
Frequently Asked Questions
Q: What makes this model unique?
This model uniquely combines the reasoning capabilities of DeepSeek-R1 with efficient 4-bit quantization, making it accessible for deployment while maintaining high performance on challenging tasks like mathematics and coding.
Q: What are the recommended use cases?
The model excels in mathematical problem-solving, coding tasks, and complex reasoning scenarios. It's particularly well-suited for applications requiring strong analytical capabilities while operating within memory constraints.
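For serving under memory constraints, the card mentions vLLM. A hedged deployment sketch follows; the model path is a placeholder, and the bitsandbytes quantization and load-format options are version-dependent in vLLM, so verify them against your installed version's documentation:

```python
# Sketch: offline inference with vLLM on a BNB 4-bit checkpoint.
from vllm import LLM, SamplingParams

llm = LLM(
    model="DeepSeek-R1-Distill-Qwen-32B-bnb-4bit",  # placeholder path
    quantization="bitsandbytes",   # version-dependent; check your vLLM docs
    load_format="bitsandbytes",
    max_model_len=32768,           # matches the model's context length
)

params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=1024)
outputs = llm.generate(
    ["Write a Python function that checks whether a number is prime."], params
)
print(outputs[0].outputs[0].text)
```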