# Phind-CodeLlama-34B-v2

| Property | Value |
|---|---|
| License | LLaMA 2 |
| Training Hardware | 32x A100-80GB GPUs |
| Training Time | 15 hours (480 GPU-hours) |
| HumanEval Score | 73.8% pass@1 |
## What is Phind-CodeLlama-34B-v2?
Phind-CodeLlama-34B-v2 is a state-of-the-art open-source code generation model that represents a significant advancement in AI-powered programming assistance. Built upon its predecessor (v1), this model has been fine-tuned on an additional 1.5B tokens of high-quality programming data, achieving an impressive 73.8% pass@1 rate on HumanEval benchmarks.
## Implementation Details
The model features a native fine-tuning approach without LoRA, utilizing DeepSpeed ZeRO 3 and Flash Attention 2 for efficient training. It processes sequences up to 4096 tokens and follows the Alpaca/Vicuna instruction format for improved steerability.
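To make the instruction format concrete, here is a minimal sketch of an Alpaca/Vicuna-style prompt builder. The exact section headers (`### System Prompt`, `### User Message`, `### Assistant`) and the default system prompt are assumptions based on how this family of models is commonly prompted; verify them against the official model card before use.

```python
def build_prompt(user_message: str,
                 system_prompt: str = "You are an intelligent programming assistant.") -> str:
    """Assemble an Alpaca/Vicuna-style instruction prompt.

    The section headers below are an assumed layout for
    Phind-CodeLlama-style prompting, not an official specification.
    """
    return (
        f"### System Prompt\n{system_prompt}\n\n"
        f"### User Message\n{user_message}\n\n"
        f"### Assistant\n"
    )

prompt = build_prompt("Write a Python function that reverses a string.")
print(prompt)
```

The model then generates its completion after the final `### Assistant` header; keep the full prompt (plus expected completion) within the 4096-token sequence length.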
- Multilingual support for Python, C/C++, TypeScript, Java, and other languages
- Trained on 32 A100-80GB GPUs for 15 hours
- Instruction-tuned for better human interaction
- Trained on a proprietary dataset of programming problems and solutions
## Core Capabilities
- State-of-the-art code generation with 73.8% pass@1 on HumanEval
- Comprehensive multi-language support
- Instruction-following capabilities using Alpaca/Vicuna format
- Efficient processing of complex programming tasks
- Enhanced context understanding with 4096 token sequence length
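The pass@1 figure quoted above is typically computed with the unbiased pass@k estimator introduced with the HumanEval benchmark (Chen et al., 2021): generate n samples per problem, count the c that pass the unit tests, and estimate the chance that at least one of k drawn samples is correct. A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    n: total samples generated per problem
    c: samples that pass the unit tests
    k: samples the user is allowed to draw
    Returns P(at least one of k drawn samples is correct).
    """
    if n - c < k:
        # Every possible draw of k samples contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k=1 this reduces to the fraction of passing samples.
print(pass_at_k(10, 4, 1))  # 0.4
```

A benchmark score like 73.8% pass@1 is this quantity averaged over all 164 HumanEval problems.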
## Frequently Asked Questions
**Q: What makes this model unique?**

A: This model stands out for achieving the highest HumanEval pass@1 score (73.8%) among open-source models at the time of its release, combining advanced code generation capabilities with instruction-following abilities.
**Q: What are the recommended use cases?**

A: The model excels in code generation, programming assistance, and multi-language development tasks. It's particularly suitable for developers needing help with Python, C/C++, TypeScript, and Java programming.