Stable Code 3B
| Property | Value |
|---|---|
| Parameter Count | 2.7B |
| Model Type | Decoder-only Transformer |
| Architecture | LLaMA-based with modifications |
| License | Stability AI Community License |
| Context Length | 16,384 tokens |
What is stable-code-3b?
Stable Code 3B is a code generation model developed by Stability AI that is state-of-the-art for its size class. It's a 2.7B parameter decoder-only language model pre-trained on 1.3 trillion tokens of diverse textual and code data. The model performs strongly across multiple programming languages, matching or outperforming many larger models on code benchmarks.
Implementation Details
The model is built on a modified LLaMA architecture with 32 layers, 32 attention heads, and a hidden size of 2560. It applies Rotary Position Embeddings (RoPE) to the first 25% of head embedding dimensions and uses a modified GPTNeoX tokenizer with special tokens for Fill-in-Middle (FIM) completion.
- Trained on 18 programming languages including Python, Java, JavaScript, C++, and Rust
- Achieves 32.4% pass@1 on the Python HumanEval benchmark
- Supports sequences up to 16,384 tokens in length
- Implements Flash Attention 2 for improved throughput (see the loading sketch below)
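As a concrete starting point, here is a minimal loading-and-generation sketch using the Hugging Face `transformers` library, assuming the `stabilityai/stable-code-3b` checkpoint on the Hub; the prompt and sampling parameters are illustrative, not recommendations:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model; Flash Attention 2 requires the
# flash-attn package and a supported GPU.
tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b")
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stable-code-3b",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
).to("cuda")

# Simple left-to-right code completion.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64, temperature=0.2, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```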
Core Capabilities
- Fill-in-Middle (FIM) functionality for code infilling and completion (a FIM sketch follows this list)
- Multi-language code generation and understanding
- Long context handling with 16K token support
- State-of-the-art performance among similarly sized models
- Optimized for both accuracy and efficiency
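To illustrate the FIM capability, the following self-contained sketch assumes StarCoder-style control tokens (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`); verify the exact token names against the model card before relying on them:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b")
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stable-code-3b", torch_dtype=torch.bfloat16
).to("cuda")

# Assumed StarCoder-style FIM control tokens; the model infills the
# span between the prefix and suffix.
prefix = "def add(a, b):\n    "
suffix = "\n    return result"
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=32)

# Decode only the newly generated tokens: the infilled middle span.
middle = tokenizer.decode(
    out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(middle)
```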
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its strong performance relative to its modest size, outperforming larger models such as Code Llama 7B in several languages. It also features built-in Fill-in-Middle capabilities and broad language coverage.
Q: What are the recommended use cases?
The model is ideal for code generation, completion, and understanding tasks across multiple programming languages. It's particularly well-suited for development environments requiring multi-language support and can serve as a foundation for fine-tuning on specific tasks.
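For the fine-tuning use case, a minimal LoRA sketch using the `peft` library might look as follows; the target module names are an assumption based on LLaMA-style attention naming and should be checked against the actual model definition:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("stabilityai/stable-code-3b")

# Assumed LLaMA-style projection names; inspect model.named_modules()
# to confirm which modules exist in this architecture.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()  # only the adapter weights train
```

Wrapping the base model this way keeps the original weights frozen, so task-specific adaptation stays cheap relative to full fine-tuning of all 2.7B parameters.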