codet5p-16b

Maintained By: Salesforce

CodeT5+ 16B

Author: Salesforce
License: BSD-3-Clause
Paper: View Paper
Supported Languages: Python, Java, JavaScript, Go, C++, C#, PHP, Ruby, C

What is CodeT5+ 16B?

CodeT5+ 16B is a large language model for code understanding and generation. Its encoder-decoder architecture can operate in three modes: encoder-only, decoder-only, and full encoder-decoder, which lets a single model cover a wide range of programming tasks. It scales the CodeT5 family up from the earlier 220M and 770M checkpoints to 16B parameters.
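
A minimal usage sketch with the Hugging Face transformers library, following the published model card (the checkpoint name and generation settings are taken from there; a CUDA device with enough memory for the fp16 weights is assumed, and the heavy loading is deferred into a function so nothing downloads on import):

```python
CHECKPOINT = "Salesforce/codet5p-16b"

def complete_code(prompt: str, max_length: int = 64, device: str = "cuda") -> str:
    # Imports are deferred so this module loads without pulling in the
    # ~32 GB of fp16 weights; torch and transformers are assumed installed.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        CHECKPOINT,
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
        trust_remote_code=True,  # CodeT5+ ships custom model code
    ).to(device)

    encoding = tokenizer(prompt, return_tensors="pt").to(device)
    # The seq2seq checkpoint expects decoder input ids; priming them with
    # the encoder input follows the model card's example.
    encoding["decoder_input_ids"] = encoding["input_ids"].clone()
    outputs = model.generate(**encoding, max_length=max_length)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example call (requires a large GPU): complete_code("def print_hello_world():")
```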

Implementation Details

The model employs a compute-efficient pretraining recipe: its components are initialized from frozen off-the-shelf LLMs, with CodeGen-350M-mono as the encoder and CodeGen-16B-mono as the decoder, forming a "shallow encoder and deep decoder" architecture. It is trained on a carefully curated, permissively licensed subset of the github-code dataset.

  • Diverse pretraining tasks including span denoising, causal language modeling, contrastive learning, and text-code matching
  • Supports both unimodal code data and bimodal code-text data
  • Implements instruction tuning following Code Alpaca methodology
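
As a toy illustration of the span-denoising objective listed above (sentinel names follow T5's `<extra_id_N>` convention; real pretraining samples span positions and lengths randomly rather than fixing them as done here for clarity):

```python
def span_denoise(tokens, span_starts, span_len=2):
    """T5-style span corruption: each chosen span is replaced by one
    sentinel token in the source; the target lists each sentinel
    followed by the tokens it hid, plus a closing sentinel."""
    src, tgt, i, sid = [], [], 0, 0
    starts = set(span_starts)
    while i < len(tokens):
        if i in starts:
            sentinel = f"<extra_id_{sid}>"
            src.append(sentinel)           # span collapses to a sentinel
            tgt.append(sentinel)           # target reconstructs the span
            tgt.extend(tokens[i:i + span_len])
            i += span_len
            sid += 1
        else:
            src.append(tokens[i])
            i += 1
    tgt.append(f"<extra_id_{sid}>")        # closing sentinel, as in T5
    return src, tgt

# Corrupt two fixed spans of a tokenized one-liner:
code = "def add ( a , b ) : return a + b".split()
src, tgt = span_denoise(code, span_starts=[1, 8])
# src hides "add (" and "return a" behind sentinels; tgt restores them.
```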

Core Capabilities

  • Advanced code completion with state-of-the-art performance
  • Text-to-code retrieval with +3.2 average MRR improvement
  • Code generation with 35.0% pass@1 and 54.5% pass@10 on HumanEval
  • Exceptional performance in math programming tasks
  • Multi-language support across 9 programming languages
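
The pass@1 and pass@10 figures above are conventionally computed with the unbiased estimator introduced alongside HumanEval (Chen et al., 2021); a minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of
    k samples, drawn without replacement from n generations of which c
    are correct, passes the unit tests. Computed as 1 - C(n-c, k)/C(n, k)."""
    if n - c < k:
        return 1.0  # too few failures to fill k slots: guaranteed pass
    return 1.0 - comb(n - c, k) / comb(n, k)
```

With n = 10 samples of which c = 5 pass, pass@1 is 0.5 (half of single draws succeed), while pass@10 is 1.0 (all ten draws include a passing sample).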

Frequently Asked Questions

Q: What makes this model unique?

CodeT5+ 16B stands out for its flexible architecture that can operate in multiple modes and its impressive scale of 16B parameters. It achieves state-of-the-art results in code generation tasks, even surpassing some closed-source models like OpenAI's code-cushman-001.

Q: What are the recommended use cases?

The model excels in code generation, code completion, and code understanding tasks. It's particularly effective for Python code generation, text-to-code retrieval, and solving complex programming problems, including mathematical programming tasks.
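
Text-to-code retrieval quality is typically reported as mean reciprocal rank (MRR), the metric behind the retrieval improvement cited earlier. A minimal sketch of the computation, with illustrative data shapes (the query/candidate structure here is an assumption for demonstration):

```python
def mean_reciprocal_rank(ranked_results: dict, relevant: dict) -> float:
    """MRR over queries: ranked_results maps each query to an ordered
    list of retrieved snippet ids; relevant maps each query to the id
    of its single correct code snippet."""
    total = 0.0
    for query, candidates in ranked_results.items():
        try:
            rank = candidates.index(relevant[query]) + 1  # 1-based rank
            total += 1.0 / rank
        except ValueError:
            pass  # correct snippet not retrieved: contributes 0
    return total / len(ranked_results)

# Two queries: one hit at rank 1, one at rank 2 -> MRR = (1 + 0.5) / 2
ranked = {"reverse a list": ["s1", "s2"], "read a file": ["s3", "s4", "s5"]}
gold = {"reverse a list": "s1", "read a file": "s4"}
score = mean_reciprocal_rank(ranked, gold)
```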
