codegen25-7b-multi_P

Maintained By
Salesforce

CodeGen2.5-7B-multi

Property | Value
License | Apache-2.0
Training Data | StarCoderData
Paper | CodeGen2.5 Paper
Authors | Erik Nijkamp, Hiroaki Hayashi, et al.

What is CodeGen2.5-7B-multi?

CodeGen2.5-7B-multi is an autoregressive language model for program synthesis, built by Salesforce. It was trained on 1.4T tokens from StarCoderData and achieves results competitive with StarCoderBase-15.5B while using less than half the parameters.

Implementation Details

The model supports both standard left-to-right code completion and infilling. It is a Transformer-based model that can be loaded through the Hugging Face transformers library using the AutoModelForCausalLM class. Tokenization requires OpenAI's tiktoken package, and the model supports multiple programming languages.

  • Supports both causal sampling and infill sampling modes
  • Uses special tokens such as <mask_1>, <sep>, and <eom> for infilling operations
  • Implements efficient token handling with tiktoken integration
  • Provides a straightforward API for code generation tasks
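
As a rough illustration of the causal sampling path described above, the following minimal sketch loads the model through AutoModelForCausalLM and continues a prompt. Several details are assumptions rather than facts stated on this page: the repository ID is pieced together from the page title and the maintainer name, `trust_remote_code=True` is assumed to be needed for the tiktoken-backed tokenizer, and the helper names `load` and `complete` are hypothetical.

```python
# Sketch only. MODEL_ID is an assumption built from the page title
# ("codegen25-7b-multi_P") and the maintainer ("Salesforce").
MODEL_ID = "Salesforce/codegen25-7b-multi_P"


def load(model_id: str = MODEL_ID):
    """Load tokenizer and model (needs transformers, torch and tiktoken installed)."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the tokenizer is a custom tiktoken-backed class shipped in the
    # model repository, so remote code has to be trusted explicitly.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model


def complete(prompt: str, tokenizer, model, max_length: int = 128) -> str:
    """Causal sampling: continue `prompt` left to right and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0])


# Example usage (downloads several GB of weights, so it is left commented out):
# tokenizer, model = load()
# print(complete("def hello_world():", tokenizer, model))
```

Keeping `load` separate from `complete` means the expensive download happens once, and the same tokenizer and model objects can be reused for many prompts.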

Core Capabilities

  • Multi-language program synthesis
  • Code completion and autocompletion
  • Code infilling with context awareness
  • Natural language to code generation
  • Support for multiple programming languages
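
Infilling can be sketched as a prompt-formatting step around ordinary generation. The snippet below assumes the model follows the CodeGen2-style infill format, with `<mask_1>` marking the gap, `<sep>` separating the suffix from the infilled sample, and `<eom>` ending the infill. These token strings and the helper names are assumptions for illustration and should be checked against the tokenizer's vocabulary before use.

```python
# Assumed special tokens (CodeGen2-style infill format); verify against the tokenizer.
MASK = "<mask_1>"          # marks the span the model should fill in
SEP = "<sep>"              # separates the suffix from the infilled sample
EOM = "<eom>"              # "end of mask": the model emits this when the infill is done
END_OF_TEXT = "<|endoftext|>"


def build_infill_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap into a single infill prompt."""
    return prefix + MASK + suffix + END_OF_TEXT + SEP + MASK


def extract_infill(decoded: str, prompt: str) -> str:
    """Return only the generated infill: drop the echoed prompt and anything after EOM."""
    completion = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    return completion.split(EOM, 1)[0]


def splice(prefix: str, infill: str, suffix: str) -> str:
    """Reassemble the full source once the gap has been filled."""
    return prefix + infill + suffix


prefix = "def hello_world(name):\n    "
suffix = "\n    return greeting"
prompt = build_infill_prompt(prefix, suffix)
# The prompt is then passed to the tokenizer and model.generate() as usual, and the
# result is decoded WITHOUT skipping special tokens so that EOM stays visible.
```

Truncating at `<eom>` matters because the model may keep generating after the infill has ended, and the trailing text is not part of the gap.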

Frequently Asked Questions

Q: What makes this model unique?

CodeGen2.5-7B-multi stands out for matching the performance of much larger models, such as StarCoderBase-15.5B, with a smaller parameter count. Its infilling capabilities and multi-language support make it versatile across programming tasks.

Q: What are the recommended use cases?

The model is best suited for program synthesis tasks, including generating executable code from English prompts, code completion, and code infilling. It's particularly effective when prompts are formatted as comment strings and can handle partial code completion across multiple programming languages.
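
One way to format an English prompt as a comment string is sketched below. The `comment_prompt` helper and the `COMMENT_PREFIX` table are hypothetical conveniences, not part of the model's API, and the set of languages shown is only an example.

```python
# Hypothetical helper: line-comment markers for a few example languages.
COMMENT_PREFIX = {
    "python": "# ",
    "javascript": "// ",
    "java": "// ",
    "go": "// ",
    "sql": "-- ",
}


def comment_prompt(instruction: str, language: str, code_start: str = "") -> str:
    """Turn an English instruction into comment lines, optionally followed by a code stub."""
    prefix = COMMENT_PREFIX[language.lower()]
    # Comment every line of the instruction; rstrip avoids trailing spaces on blank lines.
    lines = [(prefix + line.strip()).rstrip() for line in instruction.strip().splitlines()]
    return "\n".join(lines) + "\n" + code_start


prompt = comment_prompt("Return the nth Fibonacci number.", "python", "def fib(n):")
print(prompt)
```

Ending the prompt with the start of a function signature gives the model a concrete point to continue from, which fits the partial code completion use case described above.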
