Tiny StarCoder Py
| Property | Value |
|---|---|
| Parameter Count | 164M |
| Training Data | StarCoderData (Python subset) |
| Context Length | 8k tokens |
| License | BigCode OpenRAIL-M |
| Training Infrastructure | 32 Tesla A100 GPUs |
What is tiny_starcoder_py?
Tiny StarCoder Py is a compact 164M-parameter language model designed specifically for Python code generation. Built on the same architecture as its larger sibling StarCoder, it uses multi-query attention and supports Fill-in-the-Middle (FIM) completion while keeping a much smaller footprint. The model was trained on the Python portion of StarCoderData for approximately 6 epochs, processing roughly 100B tokens.
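The checkpoint is published on the Hugging Face Hub as bigcode/tiny_starcoder_py, so a minimal generation run looks roughly like the sketch below (standard Transformers causal-LM API; the prompt and generation settings are illustrative):

```python
# Minimal sketch: load Tiny StarCoder Py and complete a Python prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/tiny_starcoder_py"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Prompt with the start of a Python function and let the model continue it.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=48,
    pad_token_id=tokenizer.eos_token_id,  # silences the missing-pad warning
)
print(tokenizer.decode(outputs[0]))
```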
Implementation Details
The model is implemented with the Transformers library and inherits several notable design features from the StarCoder family:
- 8k context length capability for handling larger code segments
- Multi-Query Attention mechanism for efficient processing
- Fill-in-the-Middle objective for code completion tasks (see the FIM sketch after this list)
- Training completed in 18 hours using 32 Tesla A100 GPUs
- Built with PyTorch and orchestrated through Megatron-LM
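Fill-in-the-Middle prompting wraps the code before and after a gap in sentinel tokens and asks the model to produce the missing middle. A sketch, assuming the StarCoder-family sentinels <fim_prefix>, <fim_suffix>, and <fim_middle>; the function body here is purely illustrative:

```python
# Sketch of Fill-in-the-Middle (FIM) prompting with Tiny StarCoder Py.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/tiny_starcoder_py"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# The model is asked to fill the span between the prefix and the suffix.
prompt = (
    "<fim_prefix>def print_one_two_three():\n"
    "    print('one')\n"
    "    <fim_suffix>\n"
    "    print('three')<fim_middle>"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=16,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0]))
```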
Core Capabilities
- Python code generation and completion
- Assisted Generation support (sketched after this list)
- Fill-in-the-Middle code completion
- 7.84% pass@1 on the HumanEval benchmark
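In assisted (speculative) generation, a small draft model proposes tokens that a larger model verifies in parallel. The sketch below uses Tiny StarCoder Py as the draft model for the 15B StarCoder via Transformers' assistant_model argument; it assumes the two checkpoints share a tokenizer and that you have access to the gated bigcode/starcoder checkpoint:

```python
# Sketch: speculative decoding with Tiny StarCoder Py as the draft model.
# Assumes bigcode/starcoder (15B, gated) and a shared tokenizer/vocabulary.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder")  # verifier
assistant = AutoModelForCausalLM.from_pretrained("bigcode/tiny_starcoder_py")  # drafter

inputs = tokenizer("def remove_non_ascii(s: str) -> str:", return_tensors="pt")

# The assistant drafts several tokens per step; the main model accepts or
# rejects them in a single forward pass, which speeds up decoding.
outputs = model.generate(**inputs, assistant_model=assistant, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
```

Under greedy decoding the large model verifies every drafted token, so the output matches what it would have produced on its own, only faster.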
Frequently Asked Questions
Q: What makes this model unique?
This model offers a lightweight alternative to larger code generation models while retaining key features of the StarCoder architecture. Its 164M-parameter size makes it easier to deploy while still providing useful code generation capabilities.
Q: What are the recommended use cases?
The model is best suited for assisted code generation in Python. While it can handle code completion on its own, the developers recommend the 15B-parameter StarCoder or StarCoderBase models for pure code completion tasks that require higher accuracy.