EXAONE-Deep-2.4B-AWQ

By LGAI-EXAONE

EXAONE-Deep-2.4B-AWQ is an AWQ-quantized 2.4B-parameter language model optimized for reasoning tasks, featuring a 32,768-token context window and grouped-query attention (GQA).

Property         Value
---------------  -------------------------------------------------
Parameter Count  2.14B (without embeddings)
Context Length   32,768 tokens
License          EXAONE AI Model License Agreement 1.1 - NC
Quantization     AWQ 4-bit group-wise weight-only (W4A16g128)
Architecture     30 layers, GQA with 32 Q-heads and 8 KV-heads
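The W4A16g128 label means weights are stored as 4-bit integers with one scale/offset pair per group of 128 weights, while activations stay in 16-bit. A minimal sketch of that group-wise scheme (a generic min/max quantizer, not LG AI Research's actual AWQ calibration, which also rescales salient channels):

```python
import numpy as np

def quantize_g128(w, group_size=128, bits=4):
    """Quantize a 1-D weight vector to unsigned 4-bit codes with one
    (scale, offset) pair per group of 128 weights (the g128 layout)."""
    qmax = 2**bits - 1  # 15 for 4-bit
    w = w.reshape(-1, group_size)
    wmin = w.min(axis=1, keepdims=True)
    wmax = w.max(axis=1, keepdims=True)
    scale = (wmax - wmin) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard constant groups
    q = np.clip(np.round((w - wmin) / scale), 0, qmax).astype(np.uint8)
    return q, scale, wmin

def dequantize(q, scale, wmin):
    """Reconstruct approximate fp weights from codes and group params."""
    return q * scale + wmin

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=1024).astype(np.float32)
q, scale, offset = quantize_g128(w)
w_hat = dequantize(q, scale, offset).reshape(-1)
err = np.abs(w - w_hat).max()  # bounded by half a quantization step
```

Each group's reconstruction error is at most half its scale, which is why small per-group scales (rather than one scale per tensor) keep 4-bit accuracy loss low.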

What is EXAONE-Deep-2.4B-AWQ?

EXAONE-Deep-2.4B-AWQ is a reasoning-focused language model developed by LG AI Research, optimized for complex tasks such as mathematics and coding. This quantized version preserves most of the full-precision model's capability while reducing memory and compute requirements through AWQ 4-bit weight quantization.

Implementation Details

The model implements several advanced technical features that set it apart from comparable models:

  • Vocabulary size of 102,400 tokens
  • Grouped-Query Attention (GQA) architecture with 32 query heads and 8 KV heads
  • 4-bit quantization for efficient deployment
  • Tied word embeddings for parameter efficiency
  • Extended context length of 32,768 tokens
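The GQA layout above (8 KV heads shared by 32 query heads) pays off mainly in KV-cache memory at the full 32K context. A back-of-envelope estimate, taking layer and head counts from the table; the head dimension of 80 is an illustrative assumption, not a figure from this card:

```python
# Layer/head counts come from the model card; head_dim=80 is an assumption.
layers, kv_heads, q_heads, head_dim = 30, 8, 32, 80
bytes_fp16 = 2
ctx = 32_768  # full context length

# Per token, the cache stores one K and one V vector per layer per KV head.
kv_per_token = 2 * layers * kv_heads * head_dim * bytes_fp16  # bytes

gqa_cache_gib = kv_per_token * ctx / 2**30
# If every one of the 32 query heads kept its own KV (plain MHA):
mha_cache_gib = gqa_cache_gib * q_heads / kv_heads
```

Under these assumptions GQA needs roughly 2.3 GiB of fp16 KV cache at 32K tokens, a 4x reduction versus a hypothetical 32-KV-head MHA variant.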

Core Capabilities

  • Superior performance in mathematical reasoning tasks
  • Advanced coding capabilities
  • Efficient processing of long-context inputs
  • Optimized for deployment across various frameworks (TensorRT-LLM, vLLM, SGLang)
  • Streaming inference support

Frequently Asked Questions

Q: What makes this model unique?

EXAONE-Deep-2.4B-AWQ stands out for its strong reasoning capabilities despite its compact size, outperforming other models in its parameter range. Its combination of GQA attention and a 32K-token context window makes it particularly effective for complex, multi-step reasoning tasks.

Q: What are the recommended use cases?

The model excels at mathematical problem-solving, coding tasks, and general reasoning. It is particularly well suited to scenarios that require step-by-step reasoning and long-context understanding, and it performs best when prompts elicit an explicit thought process and include specific instructions for the reasoning steps.
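Since the model is trained to emit its reasoning before the final answer, downstream code typically separates the two. A minimal sketch, assuming the reasoning is wrapped in `<thought>...</thought>` tags (the tag name is an assumption about the output format, not confirmed by this card):

```python
import re

def split_reasoning(response: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer), assuming the
    chain of thought is wrapped in <thought>...</thought> tags.
    The tag name is an assumed convention for illustration."""
    m = re.search(r"<thought>(.*?)</thought>", response, re.DOTALL)
    if not m:
        return "", response.strip()
    reasoning = m.group(1).strip()
    answer = response[m.end():].strip()
    return reasoning, answer

# Hypothetical model output for a simple arithmetic prompt:
demo = (
    "<thought>\n"
    "12 * 17 = 12 * 10 + 12 * 7 = 120 + 84 = 204\n"
    "</thought>\n"
    "The answer is 204."
)
reasoning, answer = split_reasoning(demo)
```

Keeping the parser tolerant of a missing tag (returning the whole text as the answer) avoids breaking on short responses where the model skips explicit reasoning.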
