EXAONE-Deep-2.4B

LGAI-EXAONE

EXAONE-Deep-2.4B is a reasoning-focused language model with 2.14B parameters (excluding embeddings), a 32K-token context length, and strong math and coding benchmark results.

Property	Value
Parameters	2.14B (excluding embeddings)
Context Length	32,768 tokens
Layers	30
Attention Heads	GQA with 32 Q-heads and 8 KV-heads
Vocabulary Size	102,400
License	EXAONE AI Model License Agreement 1.1 - NC

What is EXAONE-Deep-2.4B?

EXAONE-Deep-2.4B is a language model developed by LG AI Research and designed for reasoning, particularly in mathematics and coding tasks. It is the smallest member of the EXAONE Deep family and is reported to outperform other models of comparable scale.

Implementation Details

The model has 30 layers and uses Grouped-Query Attention (GQA) with 32 query heads and 8 key-value heads, so every 4 query heads share one key-value head. It supports a context length of 32,768 tokens and uses tied word embeddings, which distinguishes it from its larger siblings in the EXAONE Deep family.

Advanced reasoning capabilities optimized for math and coding tasks
Efficient architecture with GQA attention mechanism
Extensive vocabulary of 102,400 tokens
Support for bfloat16 precision
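
The head counts above determine how query heads are grouped and how much smaller the key-value cache is than it would be with one key-value head per query head. The following is a minimal sketch of that arithmetic, not the model's actual implementation. The layer and head counts come from the table above; the head dimension of 80 is an assumed placeholder that this page does not state, the contiguous grouping of query heads is an assumed (though common) convention, and the names GQAConfig, queries_per_kv_head, kv_head_for_query, and kv_cache_bytes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GQAConfig:
    num_layers: int = 30        # from the property table
    num_query_heads: int = 32   # from the property table
    num_kv_heads: int = 8       # from the property table
    head_dim: int = 80          # ASSUMPTION: not stated on this page
    bytes_per_value: int = 2    # bfloat16 stores each value in 2 bytes


def queries_per_kv_head(cfg: GQAConfig) -> int:
    """Number of query heads that share a single key-value head."""
    if cfg.num_query_heads % cfg.num_kv_heads != 0:
        raise ValueError("query heads must be a multiple of KV heads")
    return cfg.num_query_heads // cfg.num_kv_heads


def kv_head_for_query(cfg: GQAConfig, query_head: int) -> int:
    """Map a query head index to its KV head (assumes contiguous groups)."""
    if not 0 <= query_head < cfg.num_query_heads:
        raise IndexError("query head index out of range")
    return query_head // queries_per_kv_head(cfg)


def kv_cache_bytes(cfg: GQAConfig, seq_len: int, kv_heads=None) -> int:
    """Bytes needed to cache keys and values for seq_len tokens."""
    heads = cfg.num_kv_heads if kv_heads is None else kv_heads
    # Factor of 2 covers one key tensor and one value tensor per layer.
    return 2 * cfg.num_layers * heads * cfg.head_dim * cfg.bytes_per_value * seq_len


cfg = GQAConfig()
gqa_bytes = kv_cache_bytes(cfg, seq_len=32_768)
mha_bytes = kv_cache_bytes(cfg, seq_len=32_768, kv_heads=cfg.num_query_heads)
print("query heads per KV head:", queries_per_kv_head(cfg))
print("cache size ratio (32 KV heads vs 8):", mha_bytes / gqa_bytes)
```

The ratio between the two cache sizes depends only on the head counts, so it holds regardless of the assumed head dimension; the absolute byte counts do not.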

Core Capabilities

92.3% pass@1 on the MATH-500 benchmark
Strong results on AIME mathematics competition problems
Competitive performance on the CSAT Math and GPQA Diamond benchmarks
46.6% pass rate on LiveCodeBench for code generation
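
For readers unfamiliar with the pass@1 metric quoted above, here is a rough sketch of how such a score is commonly computed. This page does not say how many samples per problem were drawn or which estimator was used, so the widely used unbiased pass@k estimator is shown as an assumption; the function names pass_at_k and mean_pass_at_k are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples is correct.

    n: samples generated for one problem
    c: how many of those n samples were correct
    k: sample budget being scored (k = 1 for pass@1)
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: any k-subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k: int) -> float:
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical data: three problems, 8 samples each.
example = [(8, 8), (8, 4), (8, 0)]
print(mean_pass_at_k(example, k=1))
```

For k = 1 the estimator reduces to c / n per problem, so a benchmark-level pass@1 is the average fraction of correct samples across problems.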

Frequently Asked Questions

Q: What makes this model unique?

EXAONE-Deep-2.4B stands out for its reasoning performance relative to its compact size. Its results are reported to match or exceed those of larger models, particularly in mathematical reasoning and coding tasks.

Q: What are the recommended use cases?

The model excels in mathematical problem-solving, coding tasks, and general reasoning applications. It's particularly well-suited for educational applications, automated mathematics assistance, and code generation tasks that require strong logical reasoning.