DeciCoder-1b
| Property | Value |
|---|---|
| Parameter Count | 1.1B |
| License | Apache 2.0 |
| Architecture | Decoder-only with GQA |
| Context Length | 2048 tokens |
| Training Data | StarCoder Dataset |
| Supported Languages | Python, Java, JavaScript |
What is DeciCoder-1b?
DeciCoder-1b is a 1.1B-parameter code generation model developed by Deci AI, designed for efficient and accurate code completion. Its architecture was produced with Deci's proprietary Neural Architecture Search engine (AutoNAC); it implements Grouped Query Attention (GQA) and was trained with a Fill-in-the-Middle (FIM) objective, trading a compact footprint for competitive completion quality.
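The Fill-in-the-Middle objective lets the model complete a gap between existing code, not just continue a prefix. A minimal sketch of how such a prompt is typically arranged, assuming StarCoder-style sentinel tokens (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`) since the model was trained on the StarCoder Dataset; confirm the exact special tokens against the model's tokenizer before use.

```python
# Sketch of a Fill-in-the-Middle (FIM) prompt. The sentinel tokens below are
# the StarCoder-style convention, assumed here; verify with the tokenizer.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code before and after the gap so the model generates the middle."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = build_fim_prompt(
    prefix="def add(a, b):\n    ",
    suffix="\n    return result\n",
)
```

The model then generates the "middle" span (here, something like `result = a + b`) immediately after the final sentinel.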
Implementation Details
The model architecture comprises 20 layers with 32 attention heads and utilizes a 2048-dimensional hidden size. It employs Rotary Position Embeddings and was trained on 446B tokens with a global batch size of 768 using the AdamW optimizer.
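GQA reduces memory traffic by letting groups of query heads share a smaller set of key/value heads. The sketch below illustrates the mechanism with NumPy; the KV head count of 4 is illustrative only, not DeciCoder's actual configuration (the head dimension of 64 follows from 2048 hidden size / 32 heads).

```python
import numpy as np

# Grouped Query Attention sketch: 32 query heads share a smaller set of
# key/value heads. n_kv_heads=4 is an illustrative assumption.
n_q_heads, n_kv_heads, head_dim, seq = 32, 4, 64, 8
group = n_q_heads // n_kv_heads  # query heads per shared KV head

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq, head_dim))
k = rng.standard_normal((n_kv_heads, seq, head_dim))
v = rng.standard_normal((n_kv_heads, seq, head_dim))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

out = np.empty_like(q)
for h in range(n_q_heads):
    kv = h // group  # map each query head to its shared KV head
    scores = q[h] @ k[kv].T / np.sqrt(head_dim)
    out[h] = softmax(scores) @ v[kv]
```

The output keeps one activation per query head, while the K/V cache shrinks by a factor of `group`, which is what makes GQA attractive for inference efficiency.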
- Warm-up: 9,000 steps (284k total training steps)
- Learning rate: 4e-4 with cosine schedule
- Weight decay: 0.1
- Performance: 19.1% pass@1 on Python HumanEval
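The stated schedule (linear warm-up to 4e-4 over 9,000 steps, cosine decay over the remainder) can be sketched as follows; the decay floor of 0 is an assumption, as the model card does not state a minimum learning rate.

```python
import math

# Sketch of the reported schedule: linear warm-up to the 4e-4 peak over
# 9,000 steps, then cosine decay over the remaining steps.
# A final learning rate of 0 is an assumption, not stated in the card.
PEAK_LR, WARMUP, TOTAL = 4e-4, 9_000, 284_000

def lr_at(step: int) -> float:
    if step < WARMUP:
        return PEAK_LR * step / WARMUP          # linear warm-up
    progress = (step - WARMUP) / (TOTAL - WARMUP)
    return 0.5 * PEAK_LR * (1 + math.cos(math.pi * progress))  # cosine decay
```

For example, halfway through warm-up (step 4,500) the rate is 2e-4, it peaks at 4e-4 at step 9,000, and it decays toward 0 by step 284,000.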
Core Capabilities
- Single/multiline code completion
- Context-aware code generation
- Multi-language support (Python, Java, JavaScript)
- Efficient processing with GQA architecture
Frequently Asked Questions
Q: What makes this model unique?
DeciCoder-1b stands out for its efficient architecture using Grouped Query Attention and AutoNAC technology, offering strong performance despite its relatively compact size of 1.1B parameters.
Q: What are the recommended use cases?
The model excels at code completion tasks when provided with context in the form of source code comments or function signatures. It's not designed for direct instruction-following but rather for context-based code generation.
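Because the model is not instruction-tuned, prompts work best as raw source code rather than natural-language requests. A minimal sketch of framing a completion prompt from a comment plus a function signature (the function and names here are illustrative); the resulting string would then be tokenized and passed to the model's generation call.

```python
# Build a plain code-completion prompt: a descriptive comment followed by a
# function signature gives the model context to continue from.
# The comment text and signature below are illustrative examples.
def completion_prompt(comment: str, signature: str) -> str:
    return f"# {comment}\n{signature}\n"

prompt = completion_prompt(
    comment="Return the factorial of n, computed iteratively",
    signature="def factorial(n: int) -> int:",
)
```

The model is expected to continue from the open function body, emitting the implementation as a completion rather than responding to an instruction.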