# Replit Code v1.5-3B
| Property | Value |
|---|---|
| Parameter Count | 3.3B |
| Context Length | 4096 tokens |
| Training Data | 1T tokens |
| License | Apache 2.0 |
| Framework | PyTorch |
## What is replit-code-v1_5-3b?
Replit Code v1.5-3B is a specialized code completion model developed by Replit, Inc. It was trained on 1T tokens spanning 30 programming languages, drawn from permissively licensed code in the Stack Dedup dataset and developer-oriented content from RedPajama's StackExchange dataset. The model uses a custom-trained vocabulary of 32,768 tokens, optimized for compact code representation while maintaining high coverage.
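For orientation, here is a minimal loading sketch using the Hugging Face `transformers` library; the `replit/replit-code-v1_5-3b` checkpoint name and the `trust_remote_code=True` flag are assumptions based on common Hub conventions for custom architectures, not official usage instructions.

```python
# Minimal loading sketch. Assumes the model is published as
# replit/replit-code-v1_5-3b on the Hugging Face Hub and that its
# custom architecture requires trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "replit/replit-code-v1_5-3b", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "replit/replit-code-v1_5-3b", trust_remote_code=True
)
```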
## Implementation Details
The model is implemented in PyTorch and was trained on MosaicML's platform using 128 H100-80GB GPUs, with MosaicML's LLM Foundry and Composer libraries handling the training loop. It supports both the standard transformers attention implementation and Triton-based Flash Attention for better throughput (a configuration sketch follows the list below).
- Custom-trained GPTNeoX tokenizer with an optimized 32,768-token vocabulary
- Trained in bfloat16 precision
- Supports comprehensive code generation across 30 programming languages
- 4096-token context window
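As noted above, the Triton Flash Attention path can be selected through the model configuration. The sketch below assumes an MPT-style config exposing `attn_config['attn_impl']`; that key, and the CUDA/bfloat16 setup, are assumptions rather than documented requirements.

```python
# Sketch: enable the Triton Flash Attention kernel before loading.
# Assumes an MPT-style config with attn_config['attn_impl'] (an
# assumption); requires a CUDA GPU and the triton package.
import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained(
    "replit/replit-code-v1_5-3b", trust_remote_code=True
)
config.attn_config["attn_impl"] = "triton"  # assumed MPT-style key

model = AutoModelForCausalLM.from_pretrained(
    "replit/replit-code-v1_5-3b",
    config=config,
    torch_dtype=torch.bfloat16,  # matches the bfloat16 training precision
    trust_remote_code=True,
).to("cuda")
```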
## Core Capabilities
- Advanced code completion across multiple programming languages
- Natural language processing for developer-oriented content
- Efficient token compression while maintaining coverage
- Flexible deployment options with both standard and Flash Attention implementations
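To make the completion workflow concrete, here is a hedged sketch of prompting the model for a completion; the sampling parameters are illustrative defaults rather than values recommended by Replit.

```python
# Sketch: complete a Python snippet. Sampling values are illustrative,
# not tuned recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "replit/replit-code-v1_5-3b"  # assumed Hub checkpoint name
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.2,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```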
## Frequently Asked Questions
**Q: What makes this model unique?**
This model stands out for its specialized focus on code completion, a custom-trained vocabulary that compresses code into fewer tokens than general-purpose vocabularies, and broad support for 30 programming languages. Training on both code and developer-oriented natural language makes it particularly suitable for real-world development scenarios.
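One way to check the compression claim empirically is to tokenize the same snippet with this model's tokenizer and with a general-purpose one; the GPT-2 baseline below is an arbitrary choice for illustration.

```python
# Sketch: compare token counts for the same snippet under the model's
# code-oriented tokenizer and a general-purpose GPT-2 tokenizer.
# Fewer tokens for the same code means more usable context.
from transformers import AutoTokenizer

code = "for i, line in enumerate(open('data.txt')):\n    print(i, line.strip())\n"

replit_tok = AutoTokenizer.from_pretrained(
    "replit/replit-code-v1_5-3b", trust_remote_code=True
)
gpt2_tok = AutoTokenizer.from_pretrained("gpt2")

print("replit tokens:", len(replit_tok(code)["input_ids"]))
print("gpt2 tokens:  ", len(gpt2_tok(code)["input_ids"]))
```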
**Q: What are the recommended use cases?**
The model is designed primarily for code completion and can serve as a foundation for application-specific fine-tuning (see the sketch below). It is suitable for both commercial and non-commercial applications, though users should review generated output for quality, correctness, and appropriateness.
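Finally, a compact fine-tuning sketch using the Hugging Face `Trainer`; the toy dataset, hyperparameters, and output directory are placeholders, and a real fine-tune would need substantially more data and GPU memory than shown here.

```python
# Sketch: application-specific fine-tuning with the Hugging Face Trainer.
# Dataset, hyperparameters, and output path are all placeholders.
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

repo = "replit/replit-code-v1_5-3b"  # assumed Hub checkpoint name
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # assumed fallback

# Toy corpus standing in for an application-specific code dataset.
samples = ["def add(a, b):\n    return a + b\n"]
dataset = Dataset.from_dict({"text": samples}).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="ft-out",  # placeholder output directory
        per_device_train_batch_size=1,
        num_train_epochs=1,
        bf16=True,  # matches the model's bfloat16 training precision
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```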