DeepSeek-R1-Distill-Qwen-1.5B-GGUF

Maintained By
unsloth

Base Model: Qwen2.5-Math-1.5B
License: MIT
Format: GGUF
Paper: arXiv:2501.12948

What is DeepSeek-R1-Distill-Qwen-1.5B-GGUF?

DeepSeek-R1-Distill-Qwen-1.5B-GGUF is a GGUF build of DeepSeek-R1-Distill-Qwen-1.5B, a model distilled from the much larger DeepSeek-R1 onto the Qwen2.5-Math-1.5B base and packaged for efficient local deployment. It is a notable result in model distillation: advanced reasoning capabilities are preserved while the model is reduced to just 1.5B parameters.

Implementation Details

The model is implemented in GGUF format, making it compatible with llama.cpp for local deployment. It features specialized tokens for chat interactions (<|User|> and <|Assistant|>) and supports both CPU and GPU acceleration.

  • Supports efficient inference with llama.cpp
  • Optimized for both CPU and GPU deployment
  • Recommended sampling temperature of 0.6 for stable outputs (see the sketch below)
  • Maximum generation length of 32,768 tokens
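
As a quick illustration, here is a minimal inference sketch using the llama-cpp-python bindings. The GGUF filename, quantization level, and context size below are placeholder assumptions for whichever file you download; only the 0.6 temperature comes from the recommendation above.

```python
from llama_cpp import Llama

# Load the GGUF file; the filename/quantization here is a placeholder.
llm = Llama(
    model_path="DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf",
    n_ctx=4096,       # context window for this session (the model supports longer)
    n_gpu_layers=-1,  # offload all layers to GPU; set to 0 for CPU-only inference
)

# The bindings apply the model's chat template, which uses the
# <|User|> / <|Assistant|> tokens mentioned above.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is 17 * 24? Think step by step."}],
    temperature=0.6,  # recommended setting for stable outputs
    max_tokens=2048,
)
print(response["choices"][0]["message"]["content"])
```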

Core Capabilities

  • Strong mathematical reasoning abilities (83.9% on MATH-500 benchmark)
  • Step-by-step problem solving
  • Efficient memory usage with GGUF format
  • Support for both inference and fine-tuning
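
Because the chat template consists of just the two special tokens, the model can also be driven with a raw prompt string, which is convenient for inspecting its step-by-step reasoning. This hedged sketch reuses the `llm` object from the example above; the stop token is an assumption, and distilled R1 models typically print their chain of thought (often wrapped in <think>...</think> tags) before the final answer.

```python
# Raw completion using the model's special chat tokens directly.
prompt = "<|User|>If 3x + 5 = 20, what is x?<|Assistant|>"

result = llm(
    prompt,
    temperature=0.6,
    max_tokens=1024,
    stop=["<|User|>"],  # assumption: stop if the model starts a new turn
)
# The generated text contains the model's reasoning followed by the answer.
print(result["choices"][0]["text"])
```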

Frequently Asked Questions

Q: What makes this model unique?

This model represents a successful distillation of advanced reasoning capabilities from the larger DeepSeek-R1 model into a much smaller and more accessible 1.5B parameter version, while maintaining impressive performance on mathematical and reasoning tasks.

Q: What are the recommended use cases?

The model is particularly well-suited for mathematical reasoning, step-by-step problem solving, and general reasoning tasks. It's ideal for users who need a lightweight but capable model for local deployment with limited computational resources.
