tiny_starcoder_py

Maintained by: bigcode

Tiny StarCoder Py

Parameter Count: 164M
Training Data: StarCoderData (Python)
Context Length: 8k tokens
License: BigCode OpenRAIL-M
Training Infrastructure: 32 Tesla A100 GPUs

What is tiny_starcoder_py?

Tiny StarCoder Py is a compact 164M-parameter language model designed specifically for Python code generation. Built on the same architecture as its larger sibling StarCoder, it uses multi-query attention and supports Fill-in-the-Middle (FIM) while maintaining a much smaller footprint. The model was trained on the Python subset of StarCoderData for approximately 6 epochs, processing 100B tokens.
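As a minimal sketch of using the model for completion, assuming the `transformers` library is installed and the `bigcode/tiny_starcoder_py` checkpoint is available on the Hugging Face Hub:

```python
# Minimal sketch: load tiny_starcoder_py and complete a Python prompt.
# Assumes `transformers` (and a backend such as PyTorch) is installed;
# the checkpoint is downloaded on first use.
from transformers import AutoModelForCausalLM, AutoTokenizer

CHECKPOINT = "bigcode/tiny_starcoder_py"

def complete(prompt: str, max_new_tokens: int = 32) -> str:
    """Greedy-decode a continuation of `prompt` with the 164M model."""
    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForCausalLM.from_pretrained(CHECKPOINT)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example call (downloads the checkpoint):
# print(complete("def fibonacci(n):"))
```

Given the model's small size, expect plausible but imperfect completions; sampling parameters such as `temperature` can be passed to `generate` as usual.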

Implementation Details

The model is implemented with the Transformers library and has several notable technical features:

  • 8k context length capability for handling larger code segments
  • Multi-Query Attention mechanism for efficient processing
  • Fill-in-the-Middle objective for code completion tasks
  • Training completed in 18 hours using 32 Tesla A100 GPUs
  • Built with PyTorch and orchestrated through Megatron-LM
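The FIM objective listed above is driven by sentinel tokens in the prompt: the model sees the code before and after a gap and generates the missing middle. A small sketch of building such a prompt, assuming the StarCoder-style sentinels `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>`:

```python
# Sketch: construct a Fill-in-the-Middle prompt for a StarCoder-family model.
# The sentinel token names follow the StarCoder convention; the model is
# expected to generate the missing middle after <fim_middle>.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a prefix-suffix-middle (PSM) style FIM prompt."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = build_fim_prompt(
    prefix="def print_hello():\n    ",      # code before the gap
    suffix="\n\nprint_hello()",             # code after the gap
)
```

The resulting string is passed to the tokenizer like any other prompt; the completion returned by the model is the infilled middle segment.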

Core Capabilities

  • Python code generation and completion
  • Assisted Generation support
  • Fill-in-the-Middle code completion
  • 7.84% pass@1 performance on HumanEval benchmark
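Assisted Generation typically means using a small model like this one as a draft model for a larger checkpoint. A hedged sketch, assuming a recent `transformers` version that accepts the `assistant_model` argument to `generate`, and using `bigcode/starcoder` as an illustrative large checkpoint:

```python
# Sketch: Assisted Generation with tiny_starcoder_py as the draft model.
# Assumes `transformers` new enough to support `assistant_model`; the
# large checkpoint name below is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

def assisted_complete(prompt: str, max_new_tokens: int = 64) -> str:
    tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
    large = AutoModelForCausalLM.from_pretrained("bigcode/starcoder")
    draft = AutoModelForCausalLM.from_pretrained("bigcode/tiny_starcoder_py")
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = large.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        assistant_model=draft,  # draft tokens are verified by the large model
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Because both models share a tokenizer family, the small model can propose several tokens per step that the large model verifies in one forward pass, speeding up decoding without changing the output distribution.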

Frequently Asked Questions

Q: What makes this model unique?

This model offers a lightweight alternative to larger code generation models while maintaining key StarCoder architecture features. Its 164M parameter size makes it more accessible for deployment while still providing useful code generation capabilities.

Q: What are the recommended use cases?

The model is best suited for assisted code generation tasks in Python. While it can handle code completion, the developers recommend using the 15B parameter StarCoder or StarCoderBase models for pure code completion tasks requiring higher accuracy.
