Qwen2.5-0.5B-Instruct-AWQ

by Qwen

Qwen2.5-0.5B-Instruct-AWQ is a 4-bit quantized, instruction-tuned LLM with 0.5B parameters, offering multilingual support and a 32K-token context window

  • Parameter Count: 0.49B (0.36B Non-Embedding)
  • Model Type: Causal Language Model (Instruction-tuned)
  • Architecture: Transformer with RoPE, SwiGLU, RMSNorm
  • Context Length: 32,768 tokens
  • Quantization: AWQ 4-bit
  • Model URL: Hugging Face

What is Qwen2.5-0.5B-Instruct-AWQ?

Qwen2.5-0.5B-Instruct-AWQ is a compact, efficient language model from the Qwen2.5 series of large language models. This 4-bit quantized version retains the base model's capabilities while significantly reducing memory and compute requirements. It features 24 transformer layers and a Grouped-Query Attention structure with 14 query heads and 2 key-value heads.
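
As a quick sanity check, these architecture figures can be read directly from the model's configuration on the Hugging Face Hub. A minimal sketch, assuming the repo id `Qwen/Qwen2.5-0.5B-Instruct-AWQ` (inferred from the model name, not stated in this page):

```python
# Sanity-check sketch: read the architecture figures from the Hub config.
# Assumption: the repo id below is inferred from the model name.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct-AWQ")
print(cfg.num_hidden_layers)     # expected: 24 layers
print(cfg.num_attention_heads)   # expected: 14 query heads
print(cfg.num_key_value_heads)   # expected: 2 key-value heads
```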

Implementation Details

The model implements several modern architectural features, including Rotary Position Embedding (RoPE), SwiGLU activation functions, and RMSNorm layer normalization. AWQ 4-bit quantization enables efficient deployment while preserving most of the full-precision model's quality.

  • 24 transformer layers
  • Grouped-Query Attention (GQA) with a 14:2 query-to-key-value head ratio
  • Full 32,768-token context window with 8,192-token generation capacity
  • AWQ 4-bit quantization for efficient deployment
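
A minimal loading-and-generation sketch using Hugging Face transformers is shown below. The repo id and the prompt are assumptions (the repo id is inferred from the model name); running AWQ checkpoints typically also requires the `autoawq` package, and `device_map="auto"` requires `accelerate`.

```python
# Minimal sketch. Assumptions: repo id inferred from the model name;
# `autoawq` and `accelerate` installed alongside a recent transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B-Instruct-AWQ"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the dtype stored with the AWQ weights
    device_map="auto",   # place layers on available GPU/CPU automatically
)

# Format the conversation with the instruction-tuned chat template.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what AWQ 4-bit quantization does."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```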

Core Capabilities

  • Multilingual support for 29+ languages including Chinese, English, and major European languages
  • Enhanced instruction following and structured data handling
  • Improved capabilities in coding and mathematics
  • Long-form content generation up to 8K tokens
  • Efficient processing of structured data and JSON output
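
As a hedged illustration of the structured-data point above, the sketch below prompts the model for JSON and parses the reply. It reuses the `model` and `tokenizer` from the loading example; the prompt wording is illustrative, not an official recipe.

```python
# Hedged sketch: prompt-only JSON extraction, reusing `model` and `tokenizer`
# from the loading example above. The prompt wording is an assumption.
import json

messages = [
    {"role": "system", "content": "Reply with valid JSON only, no prose."},
    {"role": "user", "content": "Return a JSON object with keys 'model' and "
                                "'params' for: Qwen2.5-0.5B-Instruct-AWQ, 0.49B."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
reply = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

try:
    print(json.loads(reply))  # parsed dict if the model complied
except json.JSONDecodeError:
    print("Non-JSON reply:", reply)  # small models can drift; retry or constrain
```

For hard guarantees, grammar-constrained decoding (e.g., via vLLM or the outlines library) is more reliable than prompt-only JSON.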

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its efficient 4-bit quantization while maintaining robust capabilities across multiple languages and tasks. It offers an impressive context window of 32K tokens despite its compact size of 0.5B parameters.

Q: What are the recommended use cases?

The model is well-suited for multilingual applications, code generation, mathematical computation, and scenarios requiring structured data handling. It is particularly effective in deployments where computational efficiency is crucial but solid output quality must be maintained.
