InstructCodeT5+ 16B
| Property | Value |
|---|---|
| Author | Salesforce |
| License | BSD-3-Clause |
| Paper | CodeT5+: Open Code Large Language Models |
| Supported Languages | Python, Java, JavaScript, C++, C#, PHP, Ruby, Go, C |
What is instructcodet5p-16b?
InstructCodeT5+ 16B is an instruction-tuned code language model in the CodeT5+ family. It uses an encoder-decoder architecture that can run in encoder-only, decoder-only, or encoder-decoder mode, so a single model covers both code understanding and code generation tasks. Instruction tuning aligns it more closely with natural language instructions than its predecessors.
Implementation Details
The model uses a compute-efficient pretraining method built on a "shallow encoder and deep decoder" architecture: the encoder is initialized from CodeGen-350M-mono and the decoder from CodeGen-16B-mono. It is trained on a permissively licensed subset of the github-code dataset and supports nine programming languages. Other training details:
- Diverse pretraining tasks including span denoising, causal language modeling, contrastive learning, and text-code matching
- Instruction-tuned following the Code Alpaca approach
- Scales to 16B parameters by reusing pretrained CodeGen checkpoints rather than training every weight from scratch
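Because the model is instruction-tuned in the Code Alpaca style, prompts are usually wrapped in an instruction template before generation. The sketch below is illustrative only: the template wording is the common Alpaca format and is assumed, not confirmed, to match what this model was tuned on; the Hub id `Salesforce/instructcodet5p-16b` is inferred from the model name; and `build_prompt` and `generate` are hypothetical helper names. The loading step relies on the Hugging Face `transformers` library and needs a GPU with enough memory for a 16B-parameter model in half precision.

```python
# Hub id inferred from the model name; treat it as an assumption.
CHECKPOINT = "Salesforce/instructcodet5p-16b"

# Alpaca-style template. Whether the model was tuned on exactly this
# wording is an assumption made for illustration.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)


def build_prompt(instruction: str) -> str:
    """Wrap a natural-language request in the Alpaca-style template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def generate(instruction: str, max_new_tokens: int = 256, device: str = "cuda") -> str:
    """Load the checkpoint and decode one completion for the instruction."""
    # Imported lazily so build_prompt() works without torch/transformers.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        CHECKPOINT,
        torch_dtype=torch.float16,   # half precision to reduce memory use
        low_cpu_mem_usage=True,
        trust_remote_code=True,      # the checkpoint ships custom model code
    ).to(device)

    encoding = tokenizer(build_prompt(instruction), return_tensors="pt").to(device)
    # Feed the prompt to the decoder as well as the encoder.
    encoding["decoder_input_ids"] = encoding["input_ids"].clone()
    output_ids = model.generate(**encoding, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate("Write a Python function that reverses a string.")` would return the decoded text, prompt included. In practice the tokenizer and model would be loaded once and reused rather than reloaded on every call.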
Core Capabilities
- Code understanding and generation across the nine supported programming languages
- Text-to-code generation: 35.0% pass@1 on HumanEval, reported as state of the art among open code LLMs at the time of release
- Strong reported results on code completion and retrieval tasks
- Strong reported results on math programming tasks, including against larger models
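For readers unfamiliar with the pass@1 figure above: pass@k is commonly computed with the unbiased estimator introduced alongside HumanEval, which samples n completions per problem, counts the c that pass the unit tests, and estimates the chance that at least one of k randomly drawn samples is correct. The sketch below shows that estimator; it is assumed, not confirmed by this page, that the reported 35.0% was computed this way, and the function names are illustrative.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: 1 - C(n - c, k) / C(n, k).

    n: samples generated, c: samples that passed the tests, k: budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any draw of k contains a pass.
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems given (n, c) pairs per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to c / n, so a problem with 7 passing samples out of 20 contributes 0.35 to the benchmark average.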
Frequently Asked Questions
Q: What makes this model unique?
What sets the model apart is that it can switch between encoder-only, decoder-only, and encoder-decoder modes, so one set of weights serves code understanding, retrieval, and generation tasks. Instruction tuning and the reuse of pretrained CodeGen weights make it practical for real-world applications.
Q: What are the recommended use cases?
The model is suited to code generation, code completion, text-to-code retrieval, and math programming tasks. It fits developers who need code generation help across several programming languages, and applications that depend on code understanding.