DeepSeek-R1-Distill-Qwen-32B-bnb-4bit

DeepSeek-R1-Distill-Qwen-32B-bnb-4bit

unsloth

4-bit quantized 32B parameter Qwen model distilled from DeepSeek-R1, optimized for reasoning tasks with strong math and coding capabilities.

PropertyValue
Base ModelQwen2.5-32B
Quantization4-bit BNB
LicenseMIT License
Context Length32,768 tokens

What is DeepSeek-R1-Distill-Qwen-32B-bnb-4bit?

This is a 4-bit quantized version of the DeepSeek-R1-Distill-Qwen-32B model, which is a distilled version of the larger DeepSeek-R1 model. It represents a significant achievement in model compression while maintaining exceptional performance, particularly in reasoning tasks. The model outperforms OpenAI's o1-mini across various benchmarks and achieves state-of-the-art results for dense models.

Implementation Details

The model is implemented using BitsAndBytes (BNB) 4-bit quantization to reduce memory requirements while preserving performance. It's built on the Qwen2.5-32B architecture and has been fine-tuned using 800k samples curated with DeepSeek-R1. The model maintains a context length of 32,768 tokens and is optimized for deployment using vLLM.

  • Achieves 72.6% pass@1 on AIME 2024
  • 94.3% accuracy on MATH-500
  • 62.1% pass@1 on GPQA Diamond
  • 57.2% pass@1 on LiveCodeBench
  • 1691 rating on CodeForces

Core Capabilities

  • Advanced mathematical reasoning and problem-solving
  • Strong coding and software engineering capabilities
  • Complex logical reasoning and analysis
  • Long-context understanding and processing
  • Efficient memory usage through 4-bit quantization

Frequently Asked Questions

Q: What makes this model unique?

This model uniquely combines the reasoning capabilities of DeepSeek-R1 with efficient 4-bit quantization, making it accessible for deployment while maintaining high performance on challenging tasks like mathematics and coding.

Q: What are the recommended use cases?

The model excels in mathematical problem-solving, coding tasks, and complex reasoning scenarios. It's particularly well-suited for applications requiring strong analytical capabilities while operating within memory constraints.

Socials
Integrations
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026