# CodeT5+ 16B
| Property | Value |
|---|---|
| Author | Salesforce |
| License | BSD-3-Clause |
| Paper | View Paper |
| Supported Languages | Python, Java, JavaScript, Go, C++, C#, PHP, Ruby, C |
## What is CodeT5+ 16B?
CodeT5+ 16B is a large language model designed for code understanding and generation. Its encoder-decoder architecture can operate flexibly in encoder-only, decoder-only, or encoder-decoder mode, so a single model can serve a wide range of programming tasks. It is the largest member of the CodeT5+ family, scaling up from the original CodeT5 models (220M and 770M parameters) to 16B parameters.
## Implementation Details
The model uses a compute-efficient pretraining recipe: rather than training from scratch, its components are initialized from frozen off-the-shelf LLMs (CodeGen-350M-mono for the encoder and CodeGen-16B-mono for the decoder), yielding a "shallow encoder and deep decoder" architecture. It is trained on a carefully curated, permissively licensed subset of the github-code dataset. Key aspects of the training setup:
- Diverse pretraining tasks including span denoising, causal language modeling, contrastive learning, and text-code matching
- Supports both unimodal code data and bimodal code-text data
- Implements instruction tuning following Code Alpaca methodology
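One of these objectives, T5-style span denoising, can be illustrated with a small self-contained sketch (a simplified stand-in, not the model's actual tokenizer or masking code): contiguous spans of input tokens are replaced by sentinel tokens, and the model learns to reconstruct the masked-out content.

```python
# Simplified sketch of T5-style span denoising (illustrative only; the
# real CodeT5+ pipeline operates on subword tokens with its own masking
# rates and sentinel vocabulary).

def span_denoise(tokens, spans):
    """Replace each (start, end) token span with a sentinel token.

    Returns (corrupted_input, target), where the target interleaves
    sentinels with the original masked spans, as in T5 pretraining.
    """
    corrupted, target = [], []
    prev_end = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        corrupted.extend(tokens[prev_end:start])  # keep unmasked tokens
        corrupted.append(sentinel)                # mask the span
        target.append(sentinel)                   # target recovers the span
        target.extend(tokens[start:end])
        prev_end = end
    corrupted.extend(tokens[prev_end:])
    return corrupted, target

tokens = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
corrupted, target = span_denoise(tokens, [(1, 2), (8, 11)])
# corrupted: ['def', '<extra_id_0>', '(', 'a', ',', 'b', ')', ':', '<extra_id_1>', 'b']
# target:    ['<extra_id_0>', 'add', '<extra_id_1>', 'return', 'a', '+']
```

During pretraining the encoder reads the corrupted sequence and the decoder is trained to emit the target, which forces the model to reason about code structure from bidirectional context.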
## Core Capabilities
- Advanced code completion with state-of-the-art performance
- Text-to-code retrieval with +3.2 average MRR improvement
- Code generation with 35.0% pass@1 and 54.5% pass@10 on HumanEval
- Exceptional performance in math programming tasks
- Multi-language support across 9 programming languages
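The retrieval and generation numbers above use standard metrics: mean reciprocal rank (MRR) for text-to-code retrieval, and pass@k for HumanEval, conventionally computed with the unbiased estimator from the HumanEval benchmark (draw n samples per problem, count the c that pass the unit tests, and estimate the probability that at least one of k drawn samples passes). A minimal sketch of both:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn from n total (of which c are correct) passes."""
    if n - c < k:
        return 1.0  # fewer failures than draws: a pass is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

def mean_reciprocal_rank(ranks):
    """MRR: mean of 1/rank of the first correct result per query."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Hypothetical example: 200 samples per problem, 70 pass the tests.
print(round(pass_at_k(200, 70, 1), 3))   # 0.35 (equals c/n for k=1)
print(round(mean_reciprocal_rank([1, 2, 4]), 3))
```

For k=1 the estimator reduces to the simple pass rate c/n, which is why pass@1 can also be read as per-sample accuracy.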
## Frequently Asked Questions
### Q: What makes this model unique?
CodeT5+ 16B stands out for its flexible architecture, which can operate in encoder-only, decoder-only, or encoder-decoder mode, and for its 16B-parameter scale. It achieves state-of-the-art results on code generation benchmarks, surpassing some closed-source models such as OpenAI's code-cushman-001.
### Q: What are the recommended use cases?
The model excels in code generation, code completion, and code understanding tasks. It's particularly effective for Python code generation, text-to-code retrieval, and solving complex programming problems, including mathematical programming tasks.