CodeParrot-Small
| Property | Value |
|---|---|
| Parameter Count | 110M |
| Model Type | GPT-2 |
| License | Apache 2.0 |
| Training Data | CodeParrot Clean Dataset |
What is codeparrot-small?
CodeParrot-small is a GPT-2 based language model designed specifically for Python code generation. Trained on the cleaned CodeParrot dataset, this 110M parameter model is a lightweight alternative to larger code-generation models, making it practical to deploy in resource-constrained environments.
Implementation Details
The model was trained on 16 A100 GPUs over approximately 29 billion tokens. Key training parameters include a batch size of 192, a context size of 1024 tokens, and 150,000 training steps with a cosine learning rate schedule.
- Optimized with a learning rate of 5e-4 and weight decay of 0.1
- Implements 2000 warmup steps for stable training
- Utilizes the Transformers library for easy integration (see the configuration sketch below)
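For illustration, the hyperparameters above could be expressed with the Transformers `TrainingArguments` API roughly as follows. This is a sketch rather than the original CodeParrot training script: the per-device batch split (12 per GPU across 16 A100s for an effective batch of 192) and the mixed-precision setting are assumptions.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters; the original
# training script may have organized these differently.
training_args = TrainingArguments(
    output_dir="codeparrot-small",
    max_steps=150_000,               # 150,000 training steps
    per_device_train_batch_size=12,  # assumed split: 12 per GPU x 16 A100s = batch size 192
    learning_rate=5e-4,
    weight_decay=0.1,
    lr_scheduler_type="cosine",      # cosine learning rate schedule
    warmup_steps=2_000,              # 2000 warmup steps
    bf16=True,                       # assumed mixed precision on A100 GPUs
)
```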
Core Capabilities
- Python code generation from prompts
- Achieves 3.80% pass@1 on the HumanEval benchmark
- Reaches 12.78% pass@100 when sampling multiple completions per problem
- Seamless integration with Hugging Face's transformers library (usage example below)
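The snippet below shows a minimal way to generate Python code with the `transformers` text-generation pipeline. It assumes the model is published on the Hugging Face Hub under the ID `codeparrot/codeparrot-small`; the prompt and decoding settings are illustrative only.

```python
from transformers import pipeline

# Load the model and tokenizer via the text-generation pipeline.
generator = pipeline("text-generation", model="codeparrot/codeparrot-small")

# Generate a completion for a simple Python prompt.
prompt = "def fibonacci(n):"
outputs = generator(prompt, max_new_tokens=64, num_return_sequences=1)
print(outputs[0]["generated_text"])
```

Pass@k figures like those above are typically obtained by sampling many completions per problem (e.g., with `do_sample=True` and a nonzero temperature) and checking each against unit tests.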
Frequently Asked Questions
Q: What makes this model unique?
CodeParrot-small stands out for balancing generation quality against model size: at 110M parameters it is inexpensive to run, making it particularly suitable for developers who need a lightweight code generation solution while maintaining reasonable benchmark performance.
Q: What are the recommended use cases?
The model is best suited for Python code generation tasks, code completion, and development assistance. It's particularly valuable in scenarios where computational resources are limited but code generation capabilities are needed.