KwaiCoder-DS-V2-Lite-Base
| Property | Value |
|---|---|
| Total Parameters | 16B (2.4B activated) |
| License | MIT License |
| Model Type | Code Generation & Mathematical Reasoning |
| Hugging Face | Model Repository |
What is KwaiCoder-DS-V2-Lite-Base?
KwaiCoder-DS-V2-Lite-Base is a language model built on DeepSeek-V2-Lite-Base and designed for code generation and mathematical problem-solving. It was pretrained on 800B tokens of high-quality data, mixed as 70% code, 20% math, and 10% text.
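To make the data mix concrete, the short sketch below converts the stated percentages into approximate token counts. It only restates the figures given above; the helper name `token_budget` is illustrative, and the results are rounded estimates rather than numbers published for the model.

```python
# Token budget implied by the stated mix: 800B tokens, 70% code, 20% math, 10% text.
TOTAL_TOKENS_B = 800  # billions of tokens, as stated in the description
MIX_PERCENT = {"code": 70, "math": 20, "text": 10}


def token_budget(total_b: int, mix_percent: dict) -> dict:
    """Split a total token count (in billions) according to percentage shares."""
    if sum(mix_percent.values()) != 100:
        raise ValueError("mix percentages must sum to 100")
    return {name: total_b * pct / 100 for name, pct in mix_percent.items()}


budget = token_budget(TOTAL_TOKENS_B, MIX_PERCENT)
for name, tokens_b in budget.items():
    print(f"{name}: about {tokens_b:.0f}B tokens")
```

Under these figures, code accounts for roughly 560B tokens, math for 160B, and general text for 80B.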
Implementation Details
The model has 16B total parameters, of which only 2.4B are activated per token (a Mixture-of-Experts design inherited from DeepSeek-V2-Lite), which keeps inference compute low relative to its size. It supports both English and Chinese and can be used through the Hugging Face Transformers library for code completion and code insertion.
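A minimal sketch of both usage modes with Transformers might look like the following. Several details are assumptions rather than facts taken from this page: the repository id `Kwaipilot/KwaiCoder-DS-V2-Lite-Base`, the DeepSeek-Coder-style fill-in-the-middle sentinel tokens, and the need for `trust_remote_code=True`. Check the model repository and the tokenizer's special tokens before relying on any of them.

```python
# Assumptions (not confirmed by this page): the repository id and the
# fill-in-the-middle sentinel tokens, which follow the DeepSeek-Coder convention.
MODEL_ID = "Kwaipilot/KwaiCoder-DS-V2-Lite-Base"
FIM_BEGIN = "<｜fim▁begin｜>"
FIM_HOLE = "<｜fim▁hole｜>"
FIM_END = "<｜fim▁end｜>"


def build_insertion_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap in the assumed sentinel tokens."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and greedily continue `prompt` (downloads the weights)."""
    # Imported lazily so the prompt helper works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, trust_remote_code=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Code completion: the prompt is simply the code written so far.
completion_prompt = "# Write a quick sort algorithm\ndef quick_sort(arr):\n"

# Code insertion: the model is asked to fill the gap between prefix and suffix.
insertion_prompt = build_insertion_prompt(
    prefix="def add(a, b):\n    ",
    suffix="\n\nprint(add(1, 2))\n",
)

# Uncomment to run; this downloads the full model weights.
# print(generate(completion_prompt))
# print(generate(insertion_prompt))
```

Because this is a base model rather than an instruction-tuned one, prompts work best as code or comments to be continued, not as chat-style requests.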
- Achieved 75.0% Pass@1 on HumanEval and 68.9% on HumanEval+
- Outperformed larger models like CodeLlama-34B on mathematical tasks
- Significantly improved performance on the BigCodeBench benchmark
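For readers unfamiliar with the Pass@1 metric quoted above, the sketch below shows the commonly used unbiased pass@k estimator, where `n` samples are generated per problem and `c` of them pass the unit tests. This page does not say how the reported scores were computed, so treat this as the standard definition rather than the authors' evaluation script; the per-problem results in the example are made up for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: 1 - C(n - c, k) / C(n, k)."""
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb(n - c, k) is 0 when fewer than k samples failed, giving pass@k = 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list, k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical results: one greedy sample per problem, three of four solved.
example_results = [(1, 1), (1, 0), (1, 1), (1, 1)]
print(f"Pass@1 = {benchmark_pass_at_k(example_results, k=1):.1%}")
```

With one sample per problem, Pass@1 reduces to the fraction of problems solved on the first attempt, so 75.0% on HumanEval corresponds to 123 of its 164 problems.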
Core Capabilities
- Advanced code generation and completion
- Code insertion and modification
- Mathematical problem solving
- Bilingual support (English and Chinese)
- Superior performance on complex coding tasks
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its performance relative to its compute cost. With only 2.4B activated parameters, it reports state-of-the-art results across multiple benchmarks, including 75.0% Pass@1 on HumanEval, and outperforms larger models such as CodeLlama-34B on mathematical tasks.
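The efficiency claim can be put into numbers with a rough back-of-the-envelope sketch. It assumes the rounded parameter counts quoted on this page and, as is typical for sparsely activated models, that all 16B parameters must be held in memory even though only 2.4B are used per token. Activations and the KV cache are ignored, so real memory use will be higher.

```python
# Rounded figures from this page; real parameter counts may differ slightly.
TOTAL_PARAMS = 16e9
ACTIVE_PARAMS = 2.4e9
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(params: float, dtype: str) -> float:
    """Approximate memory needed for the weights alone, in gigabytes."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


# Share of the weights that take part in computing each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Activated per token: {active_fraction:.0%} of all parameters")

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: about {weight_memory_gb(TOTAL_PARAMS, dtype):.0f} GB of weights")
```

In other words, per-token compute is closer to that of a 2.4B dense model, while the memory footprint is still that of a 16B model (about 32 GB of weights in bfloat16).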
Q: What are the recommended use cases?
The model is suited to code generation, algorithm implementation, mathematical problem-solving, and programming tasks in English or Chinese. It is particularly suitable for developers who need efficient code completion and help with technical problem-solving.