# DeepSeek-R1-Distill-Llama-70B-AWQ
| Property | Value |
|---|---|
| Base Model | Llama-3.3-70B-Instruct |
| Quantization | 4-bit AWQ |
| Group Size | 128 |
| License | MIT License (with Llama 3.3 restrictions) |
## What is DeepSeek-R1-Distill-Llama-70B-AWQ?
This model is a 4-bit AWQ-quantized version of DeepSeek-R1-Distill-Llama-70B, which was itself distilled from the larger DeepSeek-R1 reasoning model. Quantization reduces the weight footprint to roughly a quarter of the FP16 original while preserving strong performance across benchmarks in mathematics, code generation, and reasoning.
## Implementation Details
The model was quantized with AutoAWQ version 0.2.8 using zero-point quantization, a group size (`q_group_size`) of 128, and 4-bit weights with the GEMM kernel variant. This configuration allows efficient deployment while preserving model quality.
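The configuration above corresponds to the standard AutoAWQ quantization recipe. The following is a minimal sketch, not the exact script used to produce this checkpoint; the source and output paths are illustrative:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "deepseek-ai/DeepSeek-R1-Distill-Llama-70B"  # full-precision source
quant_path = "DeepSeek-R1-Distill-Llama-70B-AWQ"          # output directory

# The settings described above: 4-bit weights, zero-point quantization,
# group size 128, GEMM kernel variant.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Calibrate and quantize (uses AutoAWQ's default calibration dataset).
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```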
Reported benchmark scores, carried over from the evaluation of the full-precision DeepSeek-R1-Distill-Llama-70B:

- AIME 2024: 70.0 pass@1
- MATH-500: 94.5 pass@1
- GPQA Diamond: 65.2 pass@1
- CodeForces: 1633 rating
## Core Capabilities
- Advanced mathematical reasoning and problem-solving
- Code generation and understanding
- Complex reasoning tasks
- Efficient deployment through 4-bit quantization (see the inference sketch below)
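For serving, vLLM can load AWQ checkpoints directly. A minimal sketch, assuming the quantized weights are available at a local or hub path (the path and `tensor_parallel_size` are placeholders to adjust for your setup); the sampling settings follow DeepSeek's published recommendation of a temperature around 0.6 for the R1 distills:

```python
from vllm import LLM, SamplingParams

# Placeholder path: point this at the quantized checkpoint.
llm = LLM(
    model="path/to/DeepSeek-R1-Distill-Llama-70B-AWQ",
    quantization="awq",
    tensor_parallel_size=2,  # adjust to the number of available GPUs
)

params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=1024)

outputs = llm.generate(
    ["Solve step by step: what is the sum of the first 100 positive integers?"],
    params,
)
print(outputs[0].outputs[0].text)
```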
## Frequently Asked Questions
**Q: What makes this model unique?**
This model pairs the reasoning ability distilled from DeepSeek-R1 with efficient 4-bit AWQ quantization, making it practical to deploy while maintaining high performance on complex tasks.
**Q: What are the recommended use cases?**
The model excels in mathematical reasoning, coding tasks, and complex problem-solving scenarios. It's particularly well-suited for applications requiring advanced reasoning capabilities while operating under memory constraints.
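To put "memory constraints" in perspective, a back-of-the-envelope estimate of the weight footprint (assuming roughly 70.6 billion parameters; actual usage adds the KV cache, activations, and AWQ scale/zero-point overhead):

```python
params = 70.6e9  # approximate parameter count of the 70B model (assumption)

fp16_gib = params * 2 / 2**30   # 2 bytes per weight -> ~131 GiB
awq_gib = params * 0.5 / 2**30  # 4 bits per weight  -> ~33 GiB

print(f"FP16 weights: ~{fp16_gib:.0f} GiB")
print(f"AWQ 4-bit weights: ~{awq_gib:.0f} GiB")
```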