DeepSeek-R1-Distill-Llama-70B-AWQ

Maintained By
Valdemardi


  • Base Model: Llama-3.3-70B-Instruct
  • Quantization: 4-bit AWQ
  • License: MIT License (with Llama 3.3 restrictions)
  • Group Size: 128

What is DeepSeek-R1-Distill-Llama-70B-AWQ?

This model is a 4-bit AWQ quantization of DeepSeek-R1-Distill-Llama-70B, a Llama-3.3-70B-Instruct model fine-tuned on reasoning data distilled from DeepSeek-R1. The quantization cuts the memory footprint roughly fourfold relative to FP16 while preserving strong performance on mathematics, code generation, and reasoning benchmarks.

Implementation Details

The model was quantized with AutoAWQ version 0.2.8 using zero_point enabled, a q_group_size of 128, and 4-bit weights with the GEMM kernel variant. This configuration allows efficient deployment while preserving the capabilities of the distilled model, which reports the following benchmark results:

  • Achieves 70.0 pass@1 on AIME 2024
  • Scores 94.5 pass@1 on MATH-500
  • Demonstrates 65.2 pass@1 on GPQA Diamond
  • Achieves 1633 rating on CodeForces
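As a sketch, the quantization settings described above map onto an AutoAWQ configuration roughly as follows. The model paths are placeholders (not taken from this card), and the actual run requires GPUs and the full-precision weights, so that part is shown as comments only:

```python
# Quantization settings described in this card, in AutoAWQ's config format.
quant_config = {
    "zero_point": True,   # asymmetric quantization with per-group zero points
    "q_group_size": 128,  # weights quantized in groups of 128
    "w_bit": 4,           # 4-bit weight quantization
    "version": "GEMM",    # GEMM kernel variant
}

# The quantization run itself would look roughly like this (paths are
# placeholders; requires GPU memory for the full-precision model):
#
# from awq import AutoAWQForCausalLM
# from transformers import AutoTokenizer
#
# path = "deepseek-ai/DeepSeek-R1-Distill-Llama-70B"
# model = AutoAWQForCausalLM.from_pretrained(path)
# tokenizer = AutoTokenizer.from_pretrained(path)
# model.quantize(tokenizer, quant_config=quant_config)
# model.save_quantized("DeepSeek-R1-Distill-Llama-70B-AWQ")
```

Exact API details may differ slightly between AutoAWQ releases; the config keys above follow the 0.2.x convention.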

Core Capabilities

  • Advanced mathematical reasoning and problem-solving
  • Code generation and understanding
  • Complex reasoning tasks
  • Efficient deployment through 4-bit quantization
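A back-of-the-envelope calculation illustrates why the 4-bit quantization matters for deployment. This rough sketch counts only weight storage and ignores KV cache, activations, and serving overhead:

```python
PARAMS = 70e9  # ~70B parameters

def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (weights only)."""
    return params * bits_per_weight / 8 / 1e9

fp16_gb = weight_memory_gb(PARAMS, 16)  # full-precision baseline
awq4_gb = weight_memory_gb(PARAMS, 4)   # 4-bit AWQ weights

print(f"FP16: ~{fp16_gb:.0f} GB, 4-bit AWQ: ~{awq4_gb:.0f} GB")
# → FP16: ~140 GB, 4-bit AWQ: ~35 GB
```

In practice the 4-bit checkpoint fits on far fewer GPUs than the FP16 original, which is the main point of this release.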

Frequently Asked Questions

Q: What makes this model unique?

This model combines the power of DeepSeek-R1's reasoning capabilities with efficient 4-bit quantization, making it practical for deployment while maintaining high performance on complex tasks.

Q: What are the recommended use cases?

The model excels in mathematical reasoning, coding tasks, and complex problem-solving scenarios. It's particularly well-suited for applications requiring advanced reasoning capabilities while operating under memory constraints.
