# r1-1776-GGUF
| Property | Value |
|---|---|
| Author | Unsloth |
| Model Type | Reasoning Language Model |
| Format | GGUF (2-bit, 3-bit, 4-bit variants) |
| Base Model | DeepSeek-R1 |
| Repository | Hugging Face |
## What is r1-1776-GGUF?
r1-1776-GGUF is a quantized GGUF build of r1-1776, a version of DeepSeek-R1 post-trained by Perplexity AI to remove CCP censorship while maintaining strong reasoning capabilities. The release uses dynamic quantization techniques, offering multiple compression levels (2-bit, 3-bit, and 4-bit) without significant accuracy loss.
## Implementation Details
The model uses selective quantization, assigning different bit widths to different components. The 2-bit variant (UD-Q2_K_XL) quantizes the mixture-of-experts layers to 2.5 bits and applies a mixed 3.5/2.5-bit scheme to the down-projection matrices, reaching a compact 211GB file size while preserving performance.
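To make the idea concrete, here is a minimal sketch of per-tensor quantization rules in Python. The tensor-name patterns and quantization types are hypothetical illustrations of selective quantization, not Unsloth's actual UD-Q2_K_XL recipe:

```python
# Illustrative sketch: choose a quantization type per tensor by name pattern.
# Patterns and bit widths below are hypothetical, not the actual recipe.
import fnmatch

# Map tensor-name patterns to quantization types; first match wins.
QUANT_RULES = [
    ("*.ffn_down_exps.*", "Q3_K"),  # down projections kept at higher precision
    ("*.ffn_*_exps.*",    "Q2_K"),  # remaining MoE expert weights quantized harder
    ("*attn*",            "Q4_K"),  # attention tensors left less aggressive
    ("*",                 "Q2_K"),  # default for everything else
]

def quant_type_for(tensor_name: str) -> str:
    """Return the quantization type for a tensor by pattern matching."""
    for pattern, qtype in QUANT_RULES:
        if fnmatch.fnmatch(tensor_name, pattern):
            return qtype
    return "Q2_K"

print(quant_type_for("blk.7.ffn_down_exps.weight"))  # -> Q3_K
print(quant_type_for("blk.7.ffn_gate_exps.weight"))  # -> Q2_K
```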
- Supports GPU acceleration with a configurable number of offloaded layers
- Implements efficient cache management with a Q4_0-quantized K cache
- Provides a flexible context window of up to 8192 tokens
- Uses a specialized prompt format with User/Assistant tokens (see the sketch after this list)
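A minimal usage sketch with the llama-cpp-python bindings, wiring these settings together. The file name, layer count, and sampling values are placeholders, the quantized-K-cache option depends on your llama-cpp-python build, and the `<｜User｜>`/`<｜Assistant｜>` tokens follow DeepSeek-R1's prompt format:

```python
# Hypothetical llama-cpp-python sketch tying the settings above together.
from llama_cpp import Llama

llm = Llama(
    model_path="r1-1776-UD-Q2_K_XL.gguf",  # placeholder filename
    n_gpu_layers=20,   # offload a configurable number of layers to the GPU
    n_ctx=8192,        # context window of up to 8192 tokens
    # type_k=2,        # GGML_TYPE_Q4_0 for the K cache, if your build supports it
)

# DeepSeek-R1-style prompt format with User/Assistant tokens.
prompt = "<｜User｜>Why is the sky blue?<｜Assistant｜>"
out = llm(prompt, max_tokens=512, temperature=0.6)
print(out["choices"][0]["text"])
```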
## Core Capabilities
- Unbiased reasoning and factual information generation
- Multilingual support, validated through multilingual evaluations
- Preserved mathematical and logical reasoning abilities
- Efficient memory use through dynamic quantization
## Frequently Asked Questions

### Q: What makes this model unique?
The model combines dynamic quantization with censorship removal, balancing size efficiency (211GB to 377GB across variants) against performance. It is notable for maintaining accuracy while providing unbiased responses on sensitive topics.
### Q: What are the recommended use cases?
The model is ideal for applications requiring strong reasoning, factual information generation, and unbiased responses. It is particularly suitable for GPU-accelerated deployments where memory efficiency matters.