Qwen2.5-Coder-1.5B-Instruct-AWQ
| Property | Value |
|---|---|
| Parameter Count | 1.54B (1.31B Non-Embedding) |
| Model Type | Causal Language Model (Instruction-tuned) |
| Architecture | Transformer with RoPE, SwiGLU, RMSNorm |
| Context Length | 32,768 tokens |
| Quantization | AWQ 4-bit |
| Paper | arXiv:2409.12186 |
What is Qwen2.5-Coder-1.5B-Instruct-AWQ?
Qwen2.5-Coder-1.5B-Instruct-AWQ is the AWQ 4-bit quantized release of the 1.5B instruction-tuned model in the latest Qwen2.5-Coder series of code-specific language models. The series was trained on 5.5 trillion tokens, including source code and text-code grounding data, and this model targets code generation, code reasoning, and code fixing.
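As a rough usage sketch, the model can be loaded with the Hugging Face `transformers` library. This assumes the checkpoint is published on the Hub as `Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ` and that an AWQ-capable backend (e.g. `autoawq`) is installed; the prompt is only an example.

```python
# Minimal generation sketch; assumes `transformers` plus an AWQ backend (e.g. autoawq)
# are installed and the checkpoint is available as Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```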
Implementation Details
The model has 28 transformer layers and uses Grouped Query Attention (GQA) with 12 query heads and 2 key-value heads. It combines RoPE (Rotary Position Embedding), SwiGLU activation, and RMSNorm, and supports a 32,768-token context window (a toy GQA sketch follows the list below).
- Advanced GQA (Grouped Query Attention) implementation
- AWQ 4-bit quantization for efficient deployment
- Comprehensive instruction tuning for code-specific tasks
- Full 32,768 token context length support
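The head counts above imply a 6:1 grouping between query and key-value heads. The toy sketch below illustrates how grouped-query attention shares key-value heads across query heads; the sequence length and head dimension are illustrative values, not the model's actual configuration.

```python
# Toy grouped-query attention sketch using the head counts above (12 Q heads, 2 KV heads).
# seq_len and head_dim are illustrative, not the model's real configuration.
import torch

num_q_heads, num_kv_heads, head_dim, seq_len = 12, 2, 64, 16
group_size = num_q_heads // num_kv_heads  # 6 query heads share each KV head

q = torch.randn(1, seq_len, num_q_heads, head_dim)   # (batch, seq, q_heads, head_dim)
k = torch.randn(1, seq_len, num_kv_heads, head_dim)  # (batch, seq, kv_heads, head_dim)
v = torch.randn(1, seq_len, num_kv_heads, head_dim)

# Repeat each KV head so every group of query heads can attend to it.
k = k.repeat_interleave(group_size, dim=2)
v = v.repeat_interleave(group_size, dim=2)

scores = torch.einsum("bqhd,bkhd->bhqk", q, k) / head_dim ** 0.5
attn = torch.softmax(scores, dim=-1)
out = torch.einsum("bhqk,bkhd->bqhd", attn, v)
print(out.shape)  # torch.Size([1, 16, 12, 64])
```

Keeping only 2 key-value heads means the KV cache is roughly 6x smaller than with full multi-head attention over 12 heads, which helps make the 32K context window affordable at inference time.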
Core Capabilities
- Code generation and completion
- Code reasoning and analysis
- Bug fixing and code optimization
- Mathematical problem-solving
- General programming assistance
Frequently Asked Questions
Q: What makes this model unique?
The model pairs AWQ 4-bit quantization, which reduces its memory footprint for deployment, with strong performance on code-related tasks. It is built on the Qwen2.5 base model and specifically optimized for programming applications while retaining general-purpose and mathematical competencies.
Q: What are the recommended use cases?
This model is well suited to code development workflows, including code generation, debugging, and technical documentation. Thanks to its AWQ quantization, it is particularly useful in resource-constrained environments where computational efficiency matters; a deployment sketch follows below.
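As one possible deployment path, the AWQ checkpoint can be run with an inference engine such as vLLM. The snippet below is a sketch assuming vLLM is installed and supports this checkpoint; in practice you would also apply the chat template rather than passing a raw prompt.

```python
# Offline inference sketch with vLLM; assumes `vllm` is installed and the
# Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ checkpoint is accessible.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ", quantization="awq")
params = SamplingParams(temperature=0.2, max_tokens=256)

# Raw prompt kept minimal for illustration; a real setup would use the chat template.
prompt = "Fix the bug in this function:\ndef add(a, b):\n    return a - b"
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```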