Qwen2.5-Coder-1.5B-Instruct-AWQ
| Property | Value |
|---|---|
| Parameter Count | 1.54B (1.31B Non-Embedding) |
| Model Type | Causal Language Model (Instruction-tuned) |
| Architecture | Transformer with RoPE, SwiGLU, RMSNorm |
| Context Length | 32,768 tokens |
| Quantization | AWQ 4-bit |
| Paper | arXiv:2409.12186 |
What is Qwen2.5-Coder-1.5B-Instruct-AWQ?
Qwen2.5-Coder-1.5B-Instruct-AWQ is the AWQ 4-bit quantized release of the 1.5B instruction-tuned model in the latest Qwen2.5-Coder series of code-specific language models. The series was trained on 5.5 trillion tokens, including source code and text-code grounding data, and this model targets code generation, code reasoning, and code fixing.
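As a rough usage sketch, the model can be loaded with the Hugging Face `transformers` library. This assumes the checkpoint is published on the Hub as `Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ` and that an AWQ-capable backend (e.g. `autoawq`) is installed; the prompt is only an example.

```python
# Minimal generation sketch; assumes `transformers` plus an AWQ backend (e.g. autoawq)
# are installed and the checkpoint is available as Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```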
Implementation Details
The model has 28 transformer layers and uses Grouped Query Attention (GQA) with 12 query heads and 2 key-value heads. It combines RoPE (Rotary Position Embedding), SwiGLU activation, and RMSNorm, and supports a 32,768-token context window (a toy GQA sketch follows the list below).
- Advanced GQA (Grouped Query Attention) implementation
- AWQ 4-bit quantization for efficient deployment
- Comprehensive instruction tuning for code-specific tasks
- Full 32,768 token context length support
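The head counts above imply a 6:1 grouping between query and key-value heads. The toy sketch below illustrates how grouped-query attention shares key-value heads across query heads; the sequence length and head dimension are illustrative values, not the model's actual configuration.

```python
# Toy grouped-query attention sketch using the head counts above (12 Q heads, 2 KV heads).
# seq_len and head_dim are illustrative, not the model's real configuration.
import torch

num_q_heads, num_kv_heads, head_dim, seq_len = 12, 2, 64, 16
group_size = num_q_heads // num_kv_heads  # 6 query heads share each KV head

q = torch.randn(1, seq_len, num_q_heads, head_dim)   # (batch, seq, q_heads, head_dim)
k = torch.randn(1, seq_len, num_kv_heads, head_dim)  # (batch, seq, kv_heads, head_dim)
v = torch.randn(1, seq_len, num_kv_heads, head_dim)

# Repeat each KV head so every group of query heads can attend to it.
k = k.repeat_interleave(group_size, dim=2)
v = v.repeat_interleave(group_size, dim=2)

scores = torch.einsum("bqhd,bkhd->bhqk", q, k) / head_dim ** 0.5
attn = torch.softmax(scores, dim=-1)
out = torch.einsum("bhqk,bkhd->bqhd", attn, v)
print(out.shape)  # torch.Size([1, 16, 12, 64])
```

Keeping only 2 key-value heads means the KV cache is roughly 6x smaller than with full multi-head attention over 12 heads, which helps make the 32K context window affordable at inference time.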
Core Capabilities
- Code generation and completion
- Code reasoning and analysis
- Bug fixing and code optimization
- Mathematical problem-solving
- General programming assistance
Frequently Asked Questions
Q: What makes this model unique?
The model pairs AWQ 4-bit quantization, which reduces its memory footprint for deployment, with strong performance on code-related tasks. It is built on the Qwen2.5 base model and specifically optimized for programming applications while retaining general-purpose and mathematical competencies.
Q: What are the recommended use cases?
This model is well suited to code development workflows, including code generation, debugging, and technical documentation. Thanks to its AWQ quantization, it is particularly useful in resource-constrained environments where computational efficiency matters; a deployment sketch follows below.
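As one possible deployment path, the AWQ checkpoint can be run with an inference engine such as vLLM. The snippet below is a sketch assuming vLLM is installed and supports this checkpoint; in practice you would also apply the chat template rather than passing a raw prompt.

```python
# Offline inference sketch with vLLM; assumes `vllm` is installed and the
# Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ checkpoint is accessible.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ", quantization="awq")
params = SamplingParams(temperature=0.2, max_tokens=256)

# Raw prompt kept minimal for illustration; a real setup would use the chat template.
prompt = "Fix the bug in this function:\ndef add(a, b):\n    return a - b"
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```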