# EXAONE-Deep-7.8B
| Property | Value |
|---|---|
| Parameter Count | 7.8B (6.98B without embeddings) |
| Context Length | 32,768 tokens |
| Architecture | 32 layers, GQA with 32 Q-heads and 8 KV-heads |
| License | EXAONE AI Model License Agreement 1.1 - NC |
| Vocabulary Size | 102,400 |
## What is EXAONE-Deep-7.8B?
EXAONE-Deep-7.8B is an advanced language model developed by LG AI Research, specifically designed to excel in reasoning tasks including mathematics and coding. The model represents a significant achievement in balancing size and performance, outperforming both open-weight models of comparable scale and proprietary models like OpenAI's o1-mini.
## Implementation Details
The model uses a 32-layer architecture with Grouped-Query Attention (GQA), pairing 32 query heads with 8 key-value heads to cut key-value cache memory during inference. Combined with a 32,768-token context window and a 102,400-token vocabulary, this makes it well suited to long, complex inputs.
- Advanced reasoning capabilities optimized for mathematical and coding problems
- Extensive context window supporting long-form reasoning
- Efficient architecture with GQA implementation
- Comprehensive vocabulary for diverse task handling
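The memory benefit of the GQA layout above can be illustrated with a back-of-the-envelope KV-cache calculation. The layer and head counts come from the table; the head dimension and fp16 storage are assumptions for illustration, not published values:

```python
# Back-of-the-envelope KV-cache sizing for a GQA model with
# EXAONE-Deep-7.8B's layer/head counts. head_dim=128 and fp16
# storage are illustrative assumptions.
LAYERS = 32
Q_HEADS = 32
KV_HEADS = 8
HEAD_DIM = 128        # assumed
BYTES_PER_VALUE = 2   # fp16, assumed
CONTEXT = 32_768

def kv_cache_bytes(kv_heads: int) -> int:
    # Two tensors (K and V) per layer, one slot per cached token.
    return 2 * LAYERS * kv_heads * HEAD_DIM * BYTES_PER_VALUE * CONTEXT

mha = kv_cache_bytes(Q_HEADS)   # if every query head kept its own KV head
gqa = kv_cache_bytes(KV_HEADS)  # actual grouped layout
print(f"MHA-style cache: {mha / 2**30:.1f} GiB")  # prints 16.0 GiB
print(f"GQA cache: {gqa / 2**30:.1f} GiB")        # prints 4.0 GiB
```

Under these assumptions, grouping 32 query heads onto 8 KV heads shrinks the full-context KV cache by a factor of four.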
## Core Capabilities
- Achieves 94.8% accuracy on MATH-500 benchmark
- 70.0% pass rate on AIME 2024 with 83.3% consistency
- 89.9% accuracy on CSAT Math 2025
- Strong coding performance with a 55.2% pass rate on LiveCodeBench
- Supports various deployment frameworks including TensorRT-LLM, vLLM, and SGLang
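As a sketch of one such deployment path, the model could be served through vLLM's OpenAI-compatible server. The Hugging Face repository id and flags below are illustrative assumptions and should be checked against your vLLM version:

```shell
# Sketch: serve the model via vLLM's OpenAI-compatible API server.
# Repository id and flags are illustrative; verify before use.
pip install vllm
python -m vllm.entrypoints.openai.api_server \
    --model LGAI-EXAONE/EXAONE-Deep-7.8B \
    --max-model-len 32768
```

TensorRT-LLM and SGLang follow analogous serve-and-query workflows; consult each framework's documentation for the exact launch commands.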
## Frequently Asked Questions
**Q: What makes this model unique?**
EXAONE-Deep-7.8B stands out for its exceptional reasoning capabilities despite its moderate size, offering performance that competes with larger models while maintaining efficiency. Its specialized architecture and training make it particularly effective for mathematical and scientific reasoning tasks.
**Q: What are the recommended use cases?**
The model excels in mathematical problem-solving, coding tasks, and complex reasoning scenarios. It's particularly well-suited for educational applications, technical problem-solving, and situations requiring detailed step-by-step reasoning. The model performs best when prompts include structured reasoning requests and clear instruction patterns.
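A minimal sketch of such a structured reasoning request, assuming a generic chat-message format (the instruction wording and message shape here are illustrative assumptions; the authoritative chat template ships with the model's tokenizer):

```python
# Hypothetical helper showing a "structured reasoning request" of the
# kind described above. Wording is an illustrative assumption, not the
# official prompt template.
def build_reasoning_messages(question: str) -> list:
    instruction = ("Please reason step by step, and put your "
                   "final answer within \\boxed{}.")
    return [{"role": "user", "content": f"{question}\n{instruction}"}]

messages = build_reasoning_messages("How many primes are below 20?")
print(messages[0]["content"])
```

The explicit step-by-step instruction and a well-defined answer format are the kind of "clear instruction patterns" the model responds to best.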