starcoder2-15b

Maintained By
bigcode

StarCoder2-15B

Property         Value
Parameter Count  15 billion
Training Data    The Stack v2 (600+ programming languages)
Context Window   16,384 tokens
License          BigCode OpenRAIL-M
Paper            Link to Paper

What is StarCoder2-15B?

StarCoder2-15B is a 15-billion-parameter code generation model maintained by BigCode. It was trained on over 4 trillion tokens spanning 600+ programming languages. Architecturally, it uses Grouped Query Attention together with sliding window attention over a 4,096-token window, within a 16,384-token context window. The model was developed using NVIDIA's NeMo Framework and trained on the NVIDIA Eos Supercomputer.
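To make the sliding window idea concrete, here is a minimal pure-Python sketch of a causal sliding-window mask. It is an illustration, not the model's actual implementation: the function names are hypothetical, and it assumes the window counts the current token, so a query at position i sees keys i - window + 1 through i. The real boundary convention may differ by one.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query position i may attend to key position j.

    Assumption: the window includes the current token, so position i sees
    at most `window` keys, ending at i itself.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def visible_span(position: int, window: int) -> tuple[int, int]:
    """Inclusive (first, last) key positions visible from `position`."""
    return max(0, position - window + 1), position


if __name__ == "__main__":
    # Tiny example: 6 tokens, window of 3. "x" marks an allowed key.
    for row in sliding_window_causal_mask(6, 3):
        print("".join("x" if allowed else "." for allowed in row))
    # Span seen by the last token of a 16,384-token context, 4,096 window.
    print(visible_span(16_383, 4_096))
```

Under this convention, a single attention layer never looks further back than the window, even when the prompt fills the whole context. Information from earlier tokens can still reach later positions indirectly, because each stacked layer extends the reach by up to one window.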

Implementation Details

Key technical details:

  • Fill-in-the-Middle (FIM) training objective, which lets the model complete code between a given prefix and suffix
  • 16,384-token context window with 4,096-token sliding window attention
  • Trained in bfloat16 precision on 1,024 H100 GPUs
  • Can be loaded at reduced precision, including 8-bit and 4-bit quantization
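As a rough guide to what the precision options mean in practice, the sketch below estimates the memory needed for the weights alone. It assumes the nominal 15 billion parameter count from the table above and idealized storage of 2, 1, and 0.5 bytes per parameter. Actual footprints will be higher, since activations, the attention cache, and quantization metadata are not counted. The helper name is hypothetical.

```python
# Idealized bytes per parameter for each precision (an assumption:
# real quantized formats carry extra scaling metadata).
BYTES_PER_PARAM = {"bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    nominal_params = 15e9  # "15 billion" from the property table
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: ~{weight_memory_gb(nominal_params, precision):.1f} GB")
```

By this estimate, halving the bytes per parameter halves the weight memory, which is why 8-bit and 4-bit loading can bring the model within reach of smaller GPUs.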

Core Capabilities

  • 46.3% pass@1 on the HumanEval benchmark
  • 33.8% pass@1 on DS-1000 dataset
  • 65.1% accuracy on GSM8K (PAL)
  • 74.08% edit-similarity on RepoBench-v1.1
  • Supports both CPU and GPU inference with various optimization options
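For readers unfamiliar with the pass@1 metric, the sketch below shows the unbiased pass@k estimator introduced alongside HumanEval. This page does not say how the scores above were computed, so treat this as background on the metric rather than a description of the evaluation setup. Here n is the number of samples generated per problem, c the number that pass the unit tests, and k the sample budget being scored.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k for one problem.

    n: samples generated, c: samples that passed, k: sample budget.
    Computes 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer failures than k: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems given (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical results for three problems, 10 samples each.
    print(mean_pass_at_k([(10, 3), (10, 0), (10, 10)], k=1))
```

For k = 1 the estimator reduces to the fraction of samples that pass, averaged over problems.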

Frequently Asked Questions

Q: What makes this model unique?

StarCoder2-15B stands out for the scale of its training data (over 4 trillion tokens from The Stack v2), its combination of Grouped Query Attention and sliding window attention, and its strong benchmark results. It is a base model rather than an instruction-tuned one, so it is geared toward code completion and generation instead of following natural language instructions.

Q: What are the recommended use cases?

The model is best suited for code completion, generation, and understanding tasks across hundreds of programming languages, and it performs best when given relevant surrounding code as context. Because it is not designed to follow natural language instructions, prompts should be phrased as code to continue rather than as requests.
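The difference between an instruction and a code-style input can be sketched as follows. The two prompt helpers are hypothetical names for illustration. The FIM sentinel strings (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`) and the Hub identifier `bigcode/starcoder2-15b` are assumptions not stated on this page, so check them against the tokenizer and the official model card before relying on them. The `generate` function uses the Hugging Face `transformers` API and is only a sketch: it needs `torch`, `transformers`, and `accelerate` installed, plus enough memory for the weights.

```python
CHECKPOINT = "bigcode/starcoder2-15b"  # assumed Hugging Face Hub id


def completion_prompt(description: str, signature: str) -> str:
    """Phrase a task as the start of a source file, not as an instruction."""
    return f"# {description}\n{signature}\n"


def fim_prompt(prefix: str, suffix: str) -> str:
    """Build a Fill-in-the-Middle prompt (sentinel names are an assumption)."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Sketch of running a prompt through the model; downloads the weights."""
    # Imported lazily so the prompt helpers work without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForCausalLM.from_pretrained(
        CHECKPOINT, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Return only the newly generated tokens, not the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(completion_prompt("Return the n-th Fibonacci number.",
                            "def fib(n: int) -> int:"))
    print(fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(1, 2))\n"))
```

A prompt such as "Write a function that returns the n-th Fibonacci number" reads as prose to a base model, while the comment-plus-signature form gives it a file to continue.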
