DeepCoder-14B-Preview
| Property | Value |
|---|---|
| Parameter Count | 14 Billion |
| Base Model | DeepSeek-R1-Distill-Qwen-14B |
| License | MIT License |
| LiveCodeBench Score | 60.6% Pass@1 |
| Model URL | https://huggingface.co/agentica-org/DeepCoder-14B-Preview |
What is DeepCoder-14B-Preview?
DeepCoder-14B-Preview is a code reasoning language model fine-tuned with distributed reinforcement learning. At only 14B parameters, it achieves performance comparable to OpenAI's o3-mini, and it handles long contexts of up to 64K tokens.
Implementation Details
The model employs an enhanced version of GRPO (GRPO+) combined with iterative context lengthening. The training dataset comprises 24K unique problem-test pairs from sources including Taco-Verified, PrimeIntellect SYNTHETIC-1, and LiveCodeBench v5. Key training modifications (a simplified sketch of the advantage step appears after this list):
- Implements offline difficulty filtering for stable training
- Removes entropy and KL loss for improved stability
- Utilizes overlong filtering to preserve long-context reasoning
- Features iterative context lengthening from 16K to 32K to 64K tokens
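To make the group-relative advantage and overlong-filtering ideas concrete, here is a minimal, illustrative Python sketch. It is not the released training code; the `group_advantages` helper, and the simplification that truncated rollouts simply receive zero advantage, are assumptions made for illustration.

```python
# Illustrative sketch only (not the released training code): GRPO-style
# group-relative advantages with a simplified form of overlong filtering,
# where rollouts truncated at the context limit receive zero advantage.
from statistics import mean, pstdev

def group_advantages(rewards, truncated, eps=1e-6):
    """Normalize each rollout's reward against its sampling group,
    then mask rollouts that were cut off by the context limit."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [
        0.0 if cut else (r - mu) / (sigma + eps)
        for r, cut in zip(rewards, truncated)
    ]

# Four rollouts for one problem; the last one hit the context limit.
print(group_advantages([1.0, 0.0, 1.0, 0.0], [False, False, False, True]))
```

Because GRPO+ as described above drops the entropy and KL terms, the policy update is driven by these group-normalized, clipped advantages alone.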
Core Capabilities
- 60.6% Pass@1 accuracy on LiveCodeBench v5
- Codeforces rating of 1936 (95.3rd percentile)
- 92.6% success rate on HumanEval+
- Supports context lengths up to 64K tokens
- Compatible with major inference systems including vLLM, HuggingFace TGI, SGLang, and TensorRT-LLM
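For example, the model can be served locally with vLLM. The snippet below is a minimal sketch assuming a GPU with enough memory for the full 64K context; the prompt and the reduced `max_tokens` value are placeholders, not official settings.

```python
# Minimal vLLM sketch (assumed hardware; shrink max_model_len if memory-bound).
from vllm import LLM, SamplingParams

llm = LLM(
    model="agentica-org/DeepCoder-14B-Preview",
    max_model_len=65536,  # the model supports contexts up to 64K tokens
)
params = SamplingParams(
    temperature=0.6,
    top_p=0.95,
    max_tokens=8192,  # raise toward 64000 for long reasoning traces
)
outputs = llm.generate(
    ["Write a Python function that returns the longest palindromic substring."],
    params,
)
print(outputs[0].outputs[0].text)
```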
Frequently Asked Questions
Q: What makes this model unique?
DeepCoder-14B-Preview stands out for achieving performance comparable to proprietary models while being fully open-source. Its improved GRPO+ training methodology and context length capabilities make it particularly effective for complex coding tasks.
Q: What are the recommended use cases?
The model excels at code generation and reasoning tasks; the recommended sampling settings are a temperature of 0.6, top_p of 0.95, and max_tokens of 64000. It is designed to take instructions directly in the user turn, without a system prompt.
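As a sketch of these settings with Hugging Face Transformers (the instruction text is illustrative, and `max_new_tokens` is trimmed from the recommended 64000 ceiling to keep the demo short):

```python
# Hedged example: user-turn-only prompt (no system message) with the
# recommended sampling settings from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "agentica-org/DeepCoder-14B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Direct instruction in the user turn only -- no system prompt.
messages = [{"role": "user", "content": "Implement binary search over a sorted list in Python."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    inputs,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
    max_new_tokens=4096,  # raise toward 64000 for full reasoning traces
)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```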