# CodeFuse-CodeLlama-34B
| Property | Value |
|---|---|
| Base Model | CodeLlama-34b-Python |
| License | Other |
| Paper | arXiv:2311.02303 |
| HumanEval Score | 74.4% (pass@1) |
## What is CodeFuse-CodeLlama-34B?
CodeFuse-CodeLlama-34B is a state-of-the-art code generation model, fine-tuned with QLoRA on CodeLlama-34b-Python using 600,000 code instruction-and-answer examples. It achieves 74.4% pass@1 on the HumanEval benchmark, surpassing both GPT-4 and other open-source alternatives.
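For reference, pass@1 is the probability that a single sampled completion passes a task's unit tests. The standard unbiased estimator, from the Codex paper that introduced HumanEval, computes pass@k from n samples per task of which c are correct; a minimal sketch:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the Codex paper:
    1 - C(n-c, k) / C(n, k), computed in a numerically stable form."""
    if n - c < k:
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# e.g. 20 samples for a task, 15 of which pass -> pass@1 estimate of 0.75
print(pass_at_k(n=20, c=15, k=1))
```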
## Implementation Details
The model was fine-tuned with a 4K context length, which can be extended to 16K if needed. It is implemented in PyTorch and runs best with CUDA 11.4. Both full-precision and 4-bit quantized versions are available, making it adaptable to a range of computational budgets.
- Supports multiple programming languages with a focus on Python
- Uses a specialized role-based prompting format for interactions (see the loading sketch after this list)
- Implements efficient tokenization with left-side padding
- Provides flexible deployment options through FasterTransformer
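A minimal loading sketch with Hugging Face transformers, assuming the published repo id `codefuse-ai/CodeFuse-CodeLlama-34B`. The `<|role_start|>...<|role_end|>` tokens follow the role-based pattern the card describes, but treat them as illustrative and verify against the official prompt template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codefuse-ai/CodeFuse-CodeLlama-34B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = "left"  # the card calls for left-side padding
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Role-based prompt format (illustrative token names)
prompt = (
    "<|role_start|>human<|role_end|>"
    "Write a Python function that checks whether a number is prime."
    "<|role_start|>bot<|role_end|>"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```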
## Core Capabilities
- Superior code generation, with 74.4% pass@1 on HumanEval
- Multi-turn conversation support through structured prompting (a multi-turn sketch appears at the end of the FAQ below)
- Efficient memory usage through the optional 4-bit quantized version (see the loading sketch after this list)
- Context understanding up to 4K tokens, extendable to 16K
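A hedged sketch of the 4-bit option using on-the-fly bitsandbytes quantization; the repo id and the rough memory figures are assumptions, and a pre-quantized checkpoint, if one is published, would be the simpler route:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "codefuse-ai/CodeFuse-CodeLlama-34B"  # assumed repo id

# NF4 weights with bf16 compute: roughly 20 GB of GPU memory for a 34B
# model, versus roughly 70 GB at full bf16 precision
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```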
## Frequently Asked Questions
### Q: What makes this model unique?
The model's distinctive feature is its performance on code generation tasks, achieved through careful QLoRA fine-tuning of the CodeLlama-34b-Python base model. Its 74.4% pass@1 on HumanEval was reported as state-of-the-art among open-source code LLMs at the time of release.
### Q: What are the recommended use cases?
The model excels in code generation, completion, and understanding tasks. It's particularly well-suited for Python development, technical documentation, and code-related conversational tasks. The model can be deployed in both research and production environments, with options for optimization through quantization.
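For the conversational use cases, prior turns are carried by concatenating them in the same role-tagged format. A sketch under the same token-name assumption as the loading example above:

```python
def build_prompt(history: list[tuple[str, str]], query: str) -> str:
    """Concatenate prior (human, bot) turns plus the new query into one
    role-tagged prompt string. Token names are illustrative."""
    parts = []
    for human, bot in history:
        parts.append(f"<|role_start|>human<|role_end|>{human}")
        parts.append(f"<|role_start|>bot<|role_end|>{bot}")
    parts.append(f"<|role_start|>human<|role_end|>{query}")
    parts.append("<|role_start|>bot<|role_end|>")  # model continues from here
    return "".join(parts)

history = [("Write a function to reverse a string.",
            "def reverse(s):\n    return s[::-1]")]
prompt = build_prompt(history, "Now add type hints and a docstring.")
```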