CodeParrot
| Property | Value |
|---|---|
| Model Size | 1.5B parameters |
| Architecture | GPT-2 |
| Training Data | CodeParrot Clean Dataset |
| Latest Version | v1.1 |
| HumanEval Score | 3.99% (pass@1) |
What is CodeParrot?
CodeParrot is a GPT-2-based language model trained specifically for Python code generation. With 1.5 billion parameters, it was trained on the CodeParrot Clean Dataset, a curated corpus of Python code, and is intended as an AI-powered assistant for Python development.
Implementation Details
The model was trained in two phases: v1.1 continued from the initial v1.0 release for an additional 30,000 steps. Training ran on 16 A100 GPUs with 40GB of memory each and processed approximately 41 billion tokens across both versions, using gradient checkpointing and a cosine learning rate schedule. Key hyperparameters (a configuration sketch follows the list):
- Context window size of 1024 tokens
- Batch size of 512 with 16x gradient accumulation
- Learning rates of 2e-4 (v1.0) and 5e-5 (v1.1)
- Weight decay of 0.1 and 750 warmup steps
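For orientation, the sketch below shows how these hyperparameters could be expressed with the Hugging Face `TrainingArguments` API. It is not the project's actual training setup: the per-device batch size, output directory, and use of the `Trainer` workflow are assumptions, and data loading is omitted.

```python
# Sketch only: the reported hyperparameters mapped onto Hugging Face
# TrainingArguments. The real CodeParrot training used its own scripts;
# the per-device batch size and dataset handling here are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="codeparrot-v1.1-continuation",  # hypothetical output path
    per_device_train_batch_size=2,    # assumed: 2 x 16 GPUs x 16 accumulation = 512 global batch
    gradient_accumulation_steps=16,   # as reported above
    learning_rate=5e-5,               # v1.1 continuation rate (v1.0 used 2e-4)
    weight_decay=0.1,
    warmup_steps=750,
    lr_scheduler_type="cosine",
    gradient_checkpointing=True,
    max_steps=30_000,                 # the additional v1.1 steps mentioned above
)

# A transformers.Trainer would consume these arguments together with the
# GPT-2 model and a tokenized code dataset (1024-token context windows).
```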
Core Capabilities
- Python code generation from natural language or code prompts
- Achieves 17.88% pass@100 on the HumanEval benchmark
- Integrates with the Hugging Face Transformers library
- Supports both direct model usage and pipeline-based generation (see the usage sketch below)
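The snippet below is a minimal usage sketch of both approaches, assuming the checkpoint is published on the Hugging Face Hub as `codeparrot/codeparrot`; prompts and generation settings are illustrative.

```python
# Minimal usage sketch with Hugging Face Transformers.
# Assumes the checkpoint is available on the Hub as "codeparrot/codeparrot".
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Option 1: high-level text-generation pipeline
pipe = pipeline("text-generation", model="codeparrot/codeparrot")
print(pipe("def hello_world():", max_new_tokens=32)[0]["generated_text"])

# Option 2: direct model and tokenizer usage
tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
model = AutoModelForCausalLM.from_pretrained("codeparrot/codeparrot")

inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```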
Frequently Asked Questions
Q: What makes this model unique?
CodeParrot's specialized training on Python code makes it particularly effective for code generation tasks. Its 1.5B-parameter GPT-2 architecture is well suited to learning programming patterns, and HumanEval results show measurable improvement from v1.0 to v1.1.
Q: What are the recommended use cases?
The model is best suited to Python code generation tasks, including code completion, function generation from descriptions, and code suggestions. It is most useful in development environments that need Python code assistance, though the modest HumanEval scores above should be weighed before relying on it in production.
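For the function-generation use case, a common pattern is to prompt with a signature and docstring and let the model complete the body. A brief sketch, again assuming the `codeparrot/codeparrot` Hub id and illustrative sampling settings:

```python
# Sketch: generating a function body from a signature and docstring.
# The Hub id and generation settings are assumptions, not prescriptions.
from transformers import pipeline

pipe = pipeline("text-generation", model="codeparrot/codeparrot")

prompt = (
    "def is_palindrome(s: str) -> bool:\n"
    '    """Return True if s reads the same forwards and backwards."""\n'
)
completion = pipe(prompt, max_new_tokens=48, do_sample=True, temperature=0.2)
print(completion[0]["generated_text"])
```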