Phind-CodeLlama-34B-Python-v1
| Property | Value |
|---|---|
| Base Model | CodeLlama-34B-Python |
| License | Llama2 |
| Training Hardware | 32x A100-80GB GPUs |
| Training Time | 90 GPU-hours |
What is Phind-CodeLlama-34B-Python-v1?
Phind-CodeLlama-34B-Python-v1 is a code generation model fine-tuned from CodeLlama-34B-Python. It achieves a 69.5% pass@1 rate on HumanEval, above the 67% reported for GPT-4. The model was trained on a proprietary dataset of approximately 80,000 high-quality programming problems and solutions.
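For readers unfamiliar with the metric, here is a minimal sketch of how a pass@k score is commonly computed, using the unbiased estimator popularized alongside HumanEval. The function names and the toy results are illustrative assumptions, not the evaluation harness used for this model.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: samples generated, c: samples that passed the unit tests, k: budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset of the samples contains at least one pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Mean pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical toy results: three problems, one sample each, two solved.
toy_results = [(1, 1), (1, 0), (1, 1)]
print(f"pass@1 = {benchmark_pass_at_k(toy_results):.3f}")
```

With one sample per problem, pass@1 reduces to the fraction of problems whose single generated solution passes all tests.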
Implementation Details
The model was fine-tuned with DeepSpeed ZeRO 3 and Flash Attention 2 in about three hours on 32 A100-80GB GPUs. Training used a sequence length of 4096 tokens and ran for two epochs over the dataset, for a total of approximately 160,000 examples.
- Native (full) fine-tune, no LoRA
- Trained on instruction-answer pairs
- Training dataset decontaminated using OpenAI's decontamination methodology
- Uses transformers library from the main git branch
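The training figures above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the quoted numbers; the variable names are illustrative, and the wall-clock time is treated as a rounded value.

```python
# Figures quoted in the text above.
dataset_size = 80_000     # approximate number of problem/solution pairs
epochs = 2
num_gpus = 32
wall_clock_hours = 3      # "three hours", a rounded figure

examples_seen = dataset_size * epochs
gpu_hours = num_gpus * wall_clock_hours

print(f"examples seen: {examples_seen:,}")
print(f"GPU-hours at exactly 3 h: {gpu_hours}")

# The table lists 90 GPU-hours, which implies slightly under 3 h of wall-clock time.
implied_hours = 90 / num_gpus
print(f"wall-clock hours implied by 90 GPU-hours: {implied_hours:.2f}")
```

Two epochs over roughly 80,000 examples give the quoted 160,000, and 90 GPU-hours across 32 GPUs is consistent with a run of just under three hours.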
Core Capabilities
- Code generation and completion
- Strongest on Python programming tasks, reflecting its CodeLlama-34B-Python base
- Instruction-tuned to follow direct task descriptions rather than chat-style dialogue
- 4096-token context window
- Suited to algorithm implementation and general programming problem solving
Frequently Asked Questions
Q: What makes this model unique?
Its 69.5% pass@1 on the HumanEval benchmark exceeds the 67% reported for GPT-4. It is specifically optimized for Python programming tasks and was trained on a high-quality, decontaminated dataset of programming problems.
Q: What are the recommended use cases?
The model is best at code generation tasks, particularly in Python. Prompt it with a direct instruction followed by "\n: " rather than a chat-style exchange. It is well suited to code completion, implementing algorithms, and solving programming problems.
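A minimal sketch of the recommended prompt format, assuming the model is loaded through the Hugging Face transformers library with PyTorch installed. The helper names, the default `model_id` (an assumed Hub identifier), and the generation settings are illustrative assumptions rather than settings taken from this page.

```python
PROMPT_SUFFIX = "\n: "  # suffix recommended in the text above


def build_prompt(task: str) -> str:
    """Return a plain instruction prompt: the task followed by the suffix."""
    return task.strip() + PROMPT_SUFFIX


def generate(task: str,
             model_id: str = "Phind/Phind-CodeLlama-34B-Python-v1",
             max_new_tokens: int = 256) -> str:
    """Generate a completion for the task.

    model_id is an assumed Hub identifier. The import is deferred so that
    build_prompt can be used without transformers installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(task), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


prompt = build_prompt("Write a function that checks whether a number is prime.")
print(repr(prompt))
```

Calling `generate` downloads and loads a 34B-parameter model, so it needs substantial memory; `build_prompt` alone is enough to see the expected prompt shape.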