OpenCoder-1.5B-Instruct
| Property | Value |
|---|---|
| Parameter Count | 1.91B |
| Model Type | Code Generation LLM |
| Architecture | Transformer-based |
| License | INF |
| Paper | OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models |
| Context Length | 4K tokens |
What is OpenCoder-1.5B-Instruct?
OpenCoder-1.5B-Instruct is part of the OpenCoder family of code generation models, designed for programming tasks in both English and Chinese. This instruction-tuned version builds on a base model pretrained on 2.5 trillion tokens, comprising 90% raw code and 10% code-related web data, and is fine-tuned on over 4.5M high-quality supervised examples for strong code generation performance.
Implementation Details
The model uses the BF16 tensor format and supports a 4K token context length. It is built on a standard Transformer architecture, and its training recipe was validated through comprehensive ablation studies of data-cleaning strategies.
- Pretrained on 2.5T tokens with optimized data distribution
- Supervised fine-tuning with 4.5M high-quality examples
- Supports both English and Chinese programming instructions
- Implements advanced file-level and repository-level deduplication
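The file-level deduplication mentioned above can be illustrated with a short, stdlib-only sketch. This is not the OpenCoder pipeline itself (which also performs fuzzy and repository-level deduplication); it only shows the basic exact-match idea: hash each file body and keep one copy per digest.

```python
import hashlib

def dedup_files(files: dict[str, str]) -> dict[str, str]:
    """Keep one copy of each unique file body, keyed by content hash.

    `files` maps a path to its source text; the first path seen for a
    given SHA-256 digest wins, mirroring simple exact file-level dedup.
    """
    seen: set[str] = set()      # digests already kept
    kept: dict[str, str] = {}
    for path, text in files.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept[path] = text
    return kept

# Illustrative corpus: two byte-identical files under different paths.
corpus = {
    "a/utils.py": "def add(a, b):\n    return a + b\n",
    "b/utils.py": "def add(a, b):\n    return a + b\n",  # exact duplicate
    "c/main.py":  "print('hello')\n",
}
unique = dedup_files(corpus)
# The duplicate body under b/utils.py is dropped; two files remain.
```

Production pipelines typically layer near-duplicate detection (e.g. MinHash) on top of exact hashing, since trivial edits defeat an exact-match filter.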
Core Capabilities
- Strong performance on HumanEval (72.5%) and MBPP (72.7%) benchmarks
- Bilingual code generation and understanding
- 4K token context window for handling larger code snippets
- Comprehensive support for various programming tasks
- Production-ready with commercial license support
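The model can be driven like any chat-tuned causal LM in Hugging Face `transformers`. Below is a minimal sketch assuming the `infly/OpenCoder-1.5B-Instruct` repository id (not stated in this card) and a machine with enough memory for BF16 weights; nothing is called at import time, so the weight download only happens when `generate_code` is invoked.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "infly/OpenCoder-1.5B-Instruct"  # assumed HF repo id

def generate_code(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model in BF16 and answer one chat-style prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,   # matches the card's BF16 format
        device_map="auto",
        trust_remote_code=True,
    )
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )

# Example (downloads the weights on first call):
# print(generate_code("Write a Python function that reverses a string."))
```

Prompts can be given in English or Chinese; keep prompt plus response within the 4K token context window.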
Frequently Asked Questions
Q: What makes this model unique?
OpenCoder-1.5B-Instruct stands out for its complete transparency, including released training data, checkpoints, and extensive documentation. It's one of the few models that provides full access to its training pipeline and synthetic data generation process.
Q: What are the recommended use cases?
The model excels in code generation, code completion, and programming assistance in both English and Chinese. It's particularly effective for software development, code documentation, and educational purposes in programming.