DeepSeek-R1-Distill-Qwen-32B-bnb-4bit
| Property | Value |
|---|---|
| Base Model | Qwen2.5-32B |
| Quantization | 4-bit (BitsAndBytes) |
| License | MIT |
| Context Length | 32,768 tokens |
What is DeepSeek-R1-Distill-Qwen-32B-bnb-4bit?
This is a 4-bit quantized version of DeepSeek-R1-Distill-Qwen-32B, itself a distilled version of the larger DeepSeek-R1 model. The quantization substantially reduces memory requirements while largely preserving the distilled model's performance, particularly on reasoning tasks. According to DeepSeek's reported benchmarks, the distilled model outperforms OpenAI's o1-mini on a range of tasks and achieves state-of-the-art results among dense models.
Implementation Details
The model uses BitsAndBytes (BNB) 4-bit quantization to reduce memory requirements while preserving performance. It is built on the Qwen2.5-32B architecture and was fine-tuned on 800k samples curated with DeepSeek-R1. The model retains a context length of 32,768 tokens and is optimized for deployment with vLLM (a loading sketch follows the benchmark list below). Reported benchmark results include:
- 72.6% pass@1 on AIME 2024
- 94.3% accuracy on MATH-500
- 62.1% pass@1 on GPQA Diamond
- 57.2% pass@1 on LiveCodeBench
- 1691 rating on Codeforces
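As a hedged sketch of how such a checkpoint is typically loaded, the snippet below uses the transformers library with a `BitsAndBytesConfig`. The repo id is a placeholder, and the NF4 quant type, bf16 compute dtype, and double quantization are common BNB 4-bit settings assumed for illustration, not confirmed settings of this release:

```python
# Sketch: loading a 32B model in 4-bit with BitsAndBytes via transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "DeepSeek-R1-Distill-Qwen-32B-bnb-4bit"  # placeholder repo id

# NF4 with bf16 compute and double quantization is a common BNB 4-bit setup;
# the exact settings baked into this checkpoint may differ.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs
)
```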
Core Capabilities
- Advanced mathematical reasoning and problem-solving
- Strong coding and software engineering capabilities
- Complex logical reasoning and analysis
- Long-context understanding and processing
- Efficient memory usage through 4-bit quantization
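To exercise these capabilities, a minimal generation sketch (continuing from the loading snippet above) is shown below. The prompt is illustrative, and the sampling values mirror the temperature range DeepSeek recommends for R1-series models (roughly 0.5-0.7); treat them as a starting point rather than official settings for this checkpoint:

```python
# Sketch: single-turn generation, reusing `model` and `tokenizer` from above.
messages = [
    {"role": "user", "content": "Prove that the sum of two even integers is even."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=2048,  # reasoning traces can be long
    do_sample=True,
    temperature=0.6,      # within DeepSeek's suggested 0.5-0.7 range
    top_p=0.95,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```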
Frequently Asked Questions
Q: What makes this model unique?
This model uniquely combines the reasoning capabilities of DeepSeek-R1 with efficient 4-bit quantization, making it accessible for deployment while maintaining high performance on challenging tasks like mathematics and coding.
Q: What are the recommended use cases?
The model excels in mathematical problem-solving, coding tasks, and complex reasoning scenarios. It's particularly well-suited for applications requiring strong analytical capabilities while operating within memory constraints.
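For serving under memory constraints, the card mentions vLLM. A hedged deployment sketch follows; the model path is a placeholder, and the bitsandbytes quantization and load-format options are version-dependent in vLLM, so verify them against your installed version's documentation:

```python
# Sketch: offline inference with vLLM on a BNB 4-bit checkpoint.
from vllm import LLM, SamplingParams

llm = LLM(
    model="DeepSeek-R1-Distill-Qwen-32B-bnb-4bit",  # placeholder path
    quantization="bitsandbytes",   # version-dependent; check your vLLM docs
    load_format="bitsandbytes",
    max_model_len=32768,           # matches the model's context length
)

params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=1024)
outputs = llm.generate(
    ["Write a Python function that checks whether a number is prime."], params
)
print(outputs[0].outputs[0].text)
```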