# EXAONE-Deep-2.4B-GGUF
| Property | Value |
|---|---|
| Parameters | 2.14B |
| Context Length | 32,768 tokens |
| License | EXAONE AI Model License Agreement 1.1 - NC |
| Architecture | 30 layers, GQA with 32 Q-heads and 8 KV-heads |
| Vocabulary Size | 102,400 |
## What is EXAONE-Deep-2.4B-GGUF?
EXAONE-Deep-2.4B-GGUF is a reasoning-focused language model developed by LG AI Research and distributed in GGUF format, designed for strong performance on mathematics and coding tasks. It balances model size and capability, delivering competitive reasoning in a compact 2.4B-parameter package (2.14B parameters excluding embeddings).
## Implementation Details
The model features a sophisticated architecture utilizing Grouped-Query Attention (GQA) with 32 query heads and 8 key-value heads, spread across 30 layers. It supports an extensive context window of 32,768 tokens and implements tied word embeddings, distinguishing it from its larger siblings in the EXAONE family.
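To make the memory benefit of GQA concrete, here is a minimal sizing sketch. The head counts, layer count, and context length come from the table above; the head dimension of 80 is an assumption for illustration only:

```python
# Rough KV-cache sizing: GQA (8 KV-heads) vs. hypothetical full MHA (32 KV-heads).
# head_dim=80 is an illustrative assumption, not a published spec.
def kv_cache_bytes(num_layers, kv_heads, head_dim, context_len, bytes_per_elem=2):
    # 2x for the separate K and V caches; fp16 (2 bytes/element) by default.
    return 2 * num_layers * kv_heads * head_dim * context_len * bytes_per_elem

gqa = kv_cache_bytes(num_layers=30, kv_heads=8, head_dim=80, context_len=32768)
mha = kv_cache_bytes(num_layers=30, kv_heads=32, head_dim=80, context_len=32768)
print(f"GQA cache: {gqa / 2**20:.0f} MiB, MHA would be {mha / gqa:.0f}x larger")
# -> GQA cache: 2400 MiB, MHA would be 4x larger
```

With 8 KV-heads shared across 32 query heads, the KV cache at full context is a quarter of what standard multi-head attention would require.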
- Multiple quantization options including Q8_0, Q6_K, Q5_K_M, Q4_K_M, and IQ4_XS in GGUF format
- Extensive vocabulary size of 102,400 tokens
- Optimized for deployment across various frameworks including TensorRT-LLM, vLLM, and llama.cpp
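As one deployment path, the quantized files can be run locally with llama.cpp. The sketch below assumes the Hugging Face repo `LGAI-EXAONE/EXAONE-Deep-2.4B-GGUF` and a `Q4_K_M` quant filename matching the listing above; check the repo's file list for exact names:

```shell
# Fetch one quantized variant (repo and file names assumed from the listing above)
huggingface-cli download LGAI-EXAONE/EXAONE-Deep-2.4B-GGUF \
    EXAONE-Deep-2.4B-Q4_K_M.gguf --local-dir .

# Run with llama.cpp: -c sets the context window, -n caps generated tokens
./llama-cli -m EXAONE-Deep-2.4B-Q4_K_M.gguf \
    -c 32768 -n 2048 -p "Solve step by step: what is 37 * 43?"
```

Smaller quants such as IQ4_XS trade some accuracy for a lower memory footprint; Q8_0 stays closest to the original weights.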
## Core Capabilities
- Enhanced reasoning abilities for mathematical problems
- Strong performance in coding tasks
- Structured thought process with `<thought>` tags
- Competitive performance against larger models
- Efficient deployment options across multiple frameworks
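The structured thought process above relies on the model's chat format. The sketch below is an assumption based on the published EXAONE turn markers (`[|user|]`, `[|assistant|]`, `<thought>`); in practice, prefer applying the tokenizer's bundled chat template:

```python
def build_prompt(question: str) -> str:
    """Assemble a single-turn prompt in the EXAONE chat format.

    The [|user|]/[|assistant|] markers and the leading <thought> tag
    follow published EXAONE Deep usage guidance (an assumption here);
    the tokenizer's own chat template is authoritative.
    """
    return (
        f"[|user|]{question}\n"
        "[|assistant|]<thought>\n"  # forcing the thought block aids reasoning
    )

prompt = build_prompt(
    "Find x if 2x + 3 = 11. Put your final answer in \\boxed{}."
)
```

Starting the assistant turn with an open `<thought>` tag nudges the model into its reasoning mode before it commits to an answer.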
## Frequently Asked Questions
**Q: What makes this model unique?**
EXAONE-Deep-2.4B stands out for strong reasoning capability relative to its size, outperforming comparable models in its class. Grouped-Query Attention and the 32,768-token context window make it particularly effective for complex, multi-step reasoning tasks.
**Q: What are the recommended use cases?**
The model excels at mathematical reasoning and coding tasks. It is most effective with structured prompts that request step-by-step reasoning; for math problems, asking for the final answer in \boxed{} notation is recommended. It performs best when its response begins with a thought block and system prompts are kept minimal.