EXAONE-Deep-2.4B
Property | Value |
---|---|
Parameters | 2.14B |
Context Length | 32,768 tokens |
Layers | 30 |
Attention Heads | GQA with 32 Q-heads and 8 KV-heads |
Vocabulary Size | 102,400 |
License | EXAONE AI Model License Agreement 1.1 - NC |
What is EXAONE-Deep-2.4B?
EXAONE-Deep-2.4B is an advanced language model developed by LG AI Research, specifically designed for superior reasoning capabilities in mathematics and coding tasks. As part of the EXAONE Deep family, this model represents a significant achievement in balancing model size with performance, outperforming other models of comparable scale.
Implementation Details
The model features a sophisticated architecture with 30 layers and employs Grouped-Query Attention (GQA) with 32 query heads and 8 key-value heads. It supports an impressive context length of 32,768 tokens and includes tied word embeddings, distinguishing it from its larger siblings in the EXAONE family.
- Advanced reasoning capabilities optimized for math and coding tasks
- Efficient architecture with GQA attention mechanism
- Extensive vocabulary of 102,400 tokens
- Support for bfloat16 precision
Core Capabilities
- Exceptional performance on MATH-500 benchmark with 92.3% pass@1
- Strong results on AIME mathematics competitions
- Competitive performance on CSAT Math and GPQA Diamond tests
- Robust code generation abilities with 46.6% pass rate on Live Code Bench
Frequently Asked Questions
Q: What makes this model unique?
EXAONE-Deep-2.4B stands out for its exceptional reasoning capabilities despite its relatively compact size. It achieves performance levels that compete with or exceed larger models, particularly in mathematical reasoning and coding tasks.
Q: What are the recommended use cases?
The model excels in mathematical problem-solving, coding tasks, and general reasoning applications. It's particularly well-suited for educational applications, automated mathematics assistance, and code generation tasks that require strong logical reasoning.