EXAONE-Deep-2.4B-GGUF

Maintained By
LGAI-EXAONE

  • Parameters: 2.14B
  • Context Length: 32,768 tokens
  • License: EXAONE AI Model License Agreement 1.1 - NC
  • Architecture: 30 layers, GQA with 32 Q-heads and 8 KV-heads
  • Vocabulary Size: 102,400

What is EXAONE-Deep-2.4B-GGUF?

EXAONE-Deep-2.4B-GGUF is an advanced language model developed by LG AI Research, specifically designed for superior reasoning capabilities in mathematics and coding tasks. This model represents a significant achievement in balancing model size and performance, offering impressive capabilities in a relatively compact 2.4B parameter package.

Implementation Details

The model features a sophisticated architecture utilizing Grouped-Query Attention (GQA) with 32 query heads and 8 key-value heads, spread across 30 layers. It supports an extensive context window of 32,768 tokens and implements tied word embeddings, distinguishing it from its larger siblings in the EXAONE family.
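The memory benefit of GQA can be illustrated with a back-of-the-envelope KV-cache calculation. This is only a sketch: the per-head dimension of 80 is an assumption not stated in this card, and the figures assume an fp16 cache.

```python
# KV-cache size estimate for EXAONE-Deep-2.4B at full context.
# LAYERS, KV_HEADS, Q_HEADS, and CTX come from the spec table above;
# HEAD_DIM is an assumed value, not stated in this card.
LAYERS = 30
Q_HEADS = 32
KV_HEADS = 8       # GQA: 32 query heads share 8 KV heads
HEAD_DIM = 80      # assumption
CTX = 32_768
BYTES = 2          # fp16 cache

def kv_cache_bytes(n_kv_heads: int) -> int:
    # 2x for the separate K and V tensors cached per layer
    return 2 * LAYERS * n_kv_heads * HEAD_DIM * CTX * BYTES

gqa = kv_cache_bytes(KV_HEADS)
mha = kv_cache_bytes(Q_HEADS)  # hypothetical full multi-head attention
print(f"GQA: {gqa / 2**30:.2f} GiB vs. MHA: {mha / 2**30:.2f} GiB")
```

Under these assumptions the full-context cache is roughly 2.3 GiB, a quarter of what full multi-head attention over all 32 heads would require.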

  • Multiple quantization options including Q8_0, Q6_K, Q5_K_M, Q4_K_M, and IQ4_XS in GGUF format
  • Extensive vocabulary size of 102,400 tokens
  • Optimized for deployment across various frameworks including TensorRT-LLM, vLLM, and llama.cpp
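As a rough guide to choosing among the quantizations above, file size scales with bits per weight. The bits-per-weight figures below are approximate llama.cpp averages (an assumption, not values published for this model), so treat the results as estimates only.

```python
# Approximate GGUF file sizes from parameter count and bits-per-weight.
# bpw values are rough llama.cpp averages (assumption); actual sizes vary.
PARAMS = 2.14e9  # from the spec table

BPW = {
    "Q8_0":   8.50,
    "Q6_K":   6.56,
    "Q5_K_M": 5.69,
    "Q4_K_M": 4.85,
    "IQ4_XS": 4.25,
}

def est_size_gib(bits_per_weight: float) -> float:
    # bits -> bytes -> GiB
    return PARAMS * bits_per_weight / 8 / 2**30

for name, bpw in BPW.items():
    print(f"{name:7s} ~{est_size_gib(bpw):.2f} GiB")
```

Lower-bit quantizations trade some accuracy for a smaller footprint; Q4_K_M is a common middle ground for local inference.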

Core Capabilities

  • Enhanced reasoning abilities for mathematical problems
  • Strong performance in coding tasks
  • Structured thought process delimited by dedicated reasoning tags
  • Competitive performance against larger models
  • Efficient deployment options across multiple frameworks

Frequently Asked Questions

Q: What makes this model unique?

EXAONE-Deep-2.4B stands out for its exceptional reasoning capabilities despite its relatively small size, outperforming comparable models in its class. Its grouped-query attention and 32K-token context window make it particularly effective for complex reasoning tasks.

Q: What are the recommended use cases?

The model excels at mathematical reasoning and coding tasks. It is particularly effective with structured prompts that request step-by-step reasoning, especially for math problems, where asking for the final answer in \boxed{} notation is recommended. It performs best when responses open with an explicit reasoning block and the system prompt is kept minimal.
