Tiny StarCoder Py
| Property | Value |
|---|---|
| Parameter Count | 164M |
| Training Data | StarCoderData (Python subset) |
| Context Length | 8k tokens |
| License | BigCode OpenRAIL-M |
| Training Infrastructure | 32 Tesla A100 GPUs |
What is tiny_starcoder_py?
Tiny StarCoder Py is a compact 164M-parameter language model designed specifically for Python code generation. Built on the same architecture as its larger sibling StarCoder, it uses multi-query attention and supports Fill-in-the-Middle (FIM) completion while keeping a much smaller footprint. The model was trained on the Python portion of StarCoderData for approximately 6 epochs, processing roughly 100B tokens.
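The checkpoint is published on the Hugging Face Hub as bigcode/tiny_starcoder_py, so a minimal generation run looks roughly like the sketch below (standard Transformers causal-LM API; the prompt and generation settings are illustrative):

```python
# Minimal sketch: load Tiny StarCoder Py and complete a Python prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/tiny_starcoder_py"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Prompt with the start of a Python function and let the model continue it.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=48,
    pad_token_id=tokenizer.eos_token_id,  # silences the missing-pad warning
)
print(tokenizer.decode(outputs[0]))
```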
Implementation Details
The model is implemented with the Transformers library and inherits several notable design features from the StarCoder family:
- 8k context length capability for handling larger code segments
- Multi-Query Attention mechanism for efficient processing
- Fill-in-the-Middle objective for code completion tasks (see the FIM sketch after this list)
- Training completed in 18 hours using 32 Tesla A100 GPUs
- Built with PyTorch and orchestrated through Megatron-LM
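Fill-in-the-Middle prompting wraps the code before and after a gap in sentinel tokens and asks the model to produce the missing middle. A sketch, assuming the StarCoder-family sentinels <fim_prefix>, <fim_suffix>, and <fim_middle>; the function body here is purely illustrative:

```python
# Sketch of Fill-in-the-Middle (FIM) prompting with Tiny StarCoder Py.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/tiny_starcoder_py"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# The model is asked to fill the span between the prefix and the suffix.
prompt = (
    "<fim_prefix>def print_one_two_three():\n"
    "    print('one')\n"
    "    <fim_suffix>\n"
    "    print('three')<fim_middle>"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=16,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0]))
```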
Core Capabilities
- Python code generation and completion
- Assisted Generation support (sketched after this list)
- Fill-in-the-Middle code completion
- 7.84% pass@1 on the HumanEval benchmark
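In assisted (speculative) generation, a small draft model proposes tokens that a larger model verifies in parallel. The sketch below uses Tiny StarCoder Py as the draft model for the 15B StarCoder via Transformers' assistant_model argument; it assumes the two checkpoints share a tokenizer and that you have access to the gated bigcode/starcoder checkpoint:

```python
# Sketch: speculative decoding with Tiny StarCoder Py as the draft model.
# Assumes bigcode/starcoder (15B, gated) and a shared tokenizer/vocabulary.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder")  # verifier
assistant = AutoModelForCausalLM.from_pretrained("bigcode/tiny_starcoder_py")  # drafter

inputs = tokenizer("def remove_non_ascii(s: str) -> str:", return_tensors="pt")

# The assistant drafts several tokens per step; the main model accepts or
# rejects them in a single forward pass, which speeds up decoding.
outputs = model.generate(**inputs, assistant_model=assistant, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
```

Under greedy decoding the large model verifies every drafted token, so the output matches what it would have produced on its own, only faster.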
Frequently Asked Questions
Q: What makes this model unique?
This model offers a lightweight alternative to larger code generation models while retaining key features of the StarCoder architecture. Its 164M-parameter size makes it easier to deploy while still providing useful code generation capabilities.
Q: What are the recommended use cases?
The model is best suited for assisted code generation in Python. While it can handle code completion on its own, the developers recommend the 15B-parameter StarCoder or StarCoderBase models for pure code completion tasks that require higher accuracy.