DeepCoder-1.5B-Preview

Maintained By
agentica-org

DeepCoder-1.5B-Preview

PropertyValue
Base ModelDeepSeek-R1-Distilled-Qwen-1.5B
Training ApproachGRPO+ with Iterative Context Lengthening
LicenseMIT License
Model URLhuggingface.co/agentica-org/DeepCoder-1.5B-Preview

What is DeepCoder-1.5B-Preview?

DeepCoder-1.5B-Preview is an advanced code reasoning language model that leverages distributed reinforcement learning to enhance code generation capabilities. Fine-tuned from DeepSeek-R1-Distilled-Qwen-1.5B, it demonstrates significant improvements in coding benchmarks, achieving 25.1% on LiveCodeBench (v5) and 73.0% on HumanEval+.

Implementation Details

The model employs an enhanced version of GRPO (GRPO+) combined with iterative context lengthening. The training dataset comprises 24K unique problem-tests pairs from Taco-Verified, PrimeIntellect SYNTHETIC-1, and LiveCodeBench v5.

  • Offline Difficulty Filtering for stable training
  • Removal of entropy and KL loss components
  • Overlong Filtering for preserving long-context reasoning
  • Modified clip high bounds for improved exploration

Core Capabilities

  • Context length handling up to 64K
  • Codeforces Rating: 963 (28.5 percentile)
  • Superior performance compared to base model across multiple benchmarks
  • Compatible with various serving systems including vLLM, HuggingFace TGI, SGLang, and TensorRT-LLM

Frequently Asked Questions

Q: What makes this model unique?

The model's unique GRPO+ training approach and iterative context lengthening enable superior code generation capabilities while maintaining stability during training. It successfully eliminates common training issues like entropy collapse while achieving strong performance on coding benchmarks.

Q: What are the recommended use cases?

DeepCoder-1.5B-Preview is particularly suited for code generation tasks, problem-solving in competitive programming scenarios, and handling long-context coding challenges. It's ideal for developers needing assistance with complex coding tasks while working within extended context windows.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.