DeepSeek-R1-Distill-Llama-70B-AWQ

Maintained By
Valdemardi


  • Base Model: Llama-3.3-70B-Instruct
  • Quantization: 4-bit AWQ
  • License: MIT License (with Llama 3.3 restrictions)
  • Group Size: 128

What is DeepSeek-R1-Distill-Llama-70B-AWQ?

This model is a 4-bit AWQ quantization of DeepSeek-R1-Distill-Llama-70B, a Llama-3.3-70B-Instruct model fine-tuned on reasoning data distilled from DeepSeek-R1. The quantization cuts the memory footprint roughly fourfold relative to FP16 while preserving strong performance on mathematics, code generation, and reasoning benchmarks.

Implementation Details

The model was quantized with AutoAWQ version 0.2.8 using zero_point enabled, a q_group_size of 128, and 4-bit weights with the GEMM kernel variant. This configuration allows efficient deployment while preserving the capabilities of the distilled model, which reports the following benchmark results:

  • Achieves 70.0 pass@1 on AIME 2024
  • Scores 94.5 pass@1 on MATH-500
  • Demonstrates 65.2 pass@1 on GPQA Diamond
  • Achieves 1633 rating on CodeForces
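As a sketch, the quantization settings described above map onto an AutoAWQ configuration roughly as follows. The model paths are placeholders (not taken from this card), and the actual run requires GPUs and the full-precision weights, so that part is shown as comments only:

```python
# Quantization settings described in this card, in AutoAWQ's config format.
quant_config = {
    "zero_point": True,   # asymmetric quantization with per-group zero points
    "q_group_size": 128,  # weights quantized in groups of 128
    "w_bit": 4,           # 4-bit weight quantization
    "version": "GEMM",    # GEMM kernel variant
}

# The quantization run itself would look roughly like this (paths are
# placeholders; requires GPU memory for the full-precision model):
#
# from awq import AutoAWQForCausalLM
# from transformers import AutoTokenizer
#
# path = "deepseek-ai/DeepSeek-R1-Distill-Llama-70B"
# model = AutoAWQForCausalLM.from_pretrained(path)
# tokenizer = AutoTokenizer.from_pretrained(path)
# model.quantize(tokenizer, quant_config=quant_config)
# model.save_quantized("DeepSeek-R1-Distill-Llama-70B-AWQ")
```

Exact API details may differ slightly between AutoAWQ releases; the config keys above follow the 0.2.x convention.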

Core Capabilities

  • Advanced mathematical reasoning and problem-solving
  • Code generation and understanding
  • Complex reasoning tasks
  • Efficient deployment through 4-bit quantization
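A back-of-the-envelope calculation illustrates why the 4-bit quantization matters for deployment. This rough sketch counts only weight storage and ignores KV cache, activations, and serving overhead:

```python
PARAMS = 70e9  # ~70B parameters

def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (weights only)."""
    return params * bits_per_weight / 8 / 1e9

fp16_gb = weight_memory_gb(PARAMS, 16)  # full-precision baseline
awq4_gb = weight_memory_gb(PARAMS, 4)   # 4-bit AWQ weights

print(f"FP16: ~{fp16_gb:.0f} GB, 4-bit AWQ: ~{awq4_gb:.0f} GB")
# → FP16: ~140 GB, 4-bit AWQ: ~35 GB
```

In practice the 4-bit checkpoint fits on far fewer GPUs than the FP16 original, which is the main point of this release.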

Frequently Asked Questions

Q: What makes this model unique?

This model combines the power of DeepSeek-R1's reasoning capabilities with efficient 4-bit quantization, making it practical for deployment while maintaining high performance on complex tasks.

Q: What are the recommended use cases?

The model excels in mathematical reasoning, coding tasks, and complex problem-solving scenarios. It's particularly well-suited for applications requiring advanced reasoning capabilities while operating under memory constraints.
