Phind-CodeLlama-34B-v2-GPTQ

Maintained By
TheBloke


Property                 Value
Base Model               CodeLlama 34B
Parameter Count          34 billion
Training Data            1.5B tokens of programming problems and solutions
HumanEval Score          73.8% pass@1
Training Infrastructure  32 A100-80GB GPUs
Training Duration        15 hours (480 GPU-hours)

What is Phind-CodeLlama-34B-v2-GPTQ?

Phind-CodeLlama-34B-v2-GPTQ is a GPTQ-quantized build of Phind's code generation model, which is fine-tuned from CodeLlama 34B. Quantization preserves most of the original model's performance while substantially reducing its size and memory requirements, making it practical to run on consumer and single-GPU hardware. The model handles multiple programming languages well, including Python, C/C++, TypeScript, and Java.

Implementation Details

The model was fine-tuned on 1.5B tokens of high-quality programming problems and solutions, using DeepSpeed ZeRO 3 and Flash Attention 2. The quantization is published in multiple configurations, from 3-bit to 4-bit with several group sizes, letting users trade output quality against VRAM requirements.

  • Multiple quantization options (3-bit to 4-bit)
  • Configurable group sizes (32g, 64g, 128g)
  • Supports sequence length of 8192 tokens
  • Compatible with AutoGPTQ and Transformers pipeline
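Loading one of these quantization configurations can be sketched as below. This assumes a recent `transformers` with GPTQ support (via `optimum` and AutoGPTQ) and uses TheBloke's convention of publishing each quantization variant as a repo branch (e.g. `gptq-4bit-32g-actorder_True`); `main` is typically the default 4-bit build. Branch names should be checked against the repo's file listing.

```python
def load_phind_codellama(revision="main", device_map="auto"):
    """Sketch: load a chosen quantization branch of the GPTQ repo.

    revision -- repo branch selecting the quantization variant
                (e.g. "gptq-4bit-32g-actorder_True"); "main" is the default build.
    """
    # Lazy import so the function can be defined without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "TheBloke/Phind-CodeLlama-34B-v2-GPTQ"
    tokenizer = AutoTokenizer.from_pretrained(repo, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(
        repo,
        revision=revision,
        device_map=device_map,  # spread layers across available GPUs
    )
    return model, tokenizer
```

For example, `load_phind_codellama("gptq-3bit-128g-actorder_True")` would pull the smallest 3-bit variant, at some cost in generation quality.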

Core Capabilities

  • State-of-the-art code generation with 73.8% pass@1 on HumanEval
  • Multi-language programming support
  • Instruction-tuned using Alpaca/Vicuna format
  • Efficient memory usage through quantization
  • Comprehensive programming problem-solving abilities
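Because the model is instruction-tuned, prompts should follow its documented Alpaca-style layout with `### System Prompt`, `### User Message`, and `### Assistant` headers. A minimal prompt builder, with header strings that should be verified against the upstream model card:

```python
def build_prompt(user_message,
                 system_prompt="You are an intelligent programming assistant."):
    """Assemble a prompt in the Alpaca-style format the model was tuned on."""
    return (
        f"### System Prompt\n{system_prompt}\n\n"
        f"### User Message\n{user_message}\n\n"
        "### Assistant\n"  # the model continues from here
    )

prompt = build_prompt("Write a Python function that reverses a string.")
```

The trailing `### Assistant` header is left open so the model's completion begins directly after it.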

Frequently Asked Questions

Q: What makes this model unique?

At the time of its release, this model set the state of the art for open-source code generation, scoring 73.8% pass@1 on HumanEval. The quantized packaging makes it far more accessible than the full-precision weights while retaining most of that performance, and it supports multiple programming languages effectively.

Q: What are the recommended use cases?

The model excels at code generation, debugging, and programming assistance across multiple languages. It's particularly well-suited for developers needing AI assistance in Python, C/C++, TypeScript, and Java programming tasks, while being resource-efficient through quantization.
