# EXAONE-3.5-7.8B-Instruct
| Property | Value |
|---|---|
| Parameters | 6.98B (without embeddings) |
| Context Length | 32,768 tokens |
| Architecture | 32 layers, GQA with 32 Q-heads and 8 KV-heads |
| Vocabulary Size | 102,400 |
| License | EXAONE AI Model License Agreement 1.1 - NC |
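The table above is enough to sketch the KV-cache footprint at full context, which is where GQA pays off. The head dimension is not listed in the table; the sketch below assumes 128 (a common choice at this scale), so treat the absolute numbers as estimates:

```python
# Estimated KV-cache size at the full 32K context, from the spec table.
# head_dim = 128 is an assumption (not listed in the table above).
layers = 32
kv_heads = 8          # GQA: 8 KV-heads shared by 32 Q-heads
head_dim = 128        # assumed
seq_len = 32_768
bytes_per_value = 2   # fp16 / bf16

# One K cache and one V cache per layer
kv_cache = 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value
print(f"GQA KV cache: {kv_cache / 2**30:.1f} GiB")   # 4.0 GiB

# For comparison, full multi-head attention (32 KV-heads) would need 4x more
mha_cache = kv_cache * (32 // kv_heads)
print(f"MHA KV cache: {mha_cache / 2**30:.1f} GiB")  # 16.0 GiB
```

Under these assumptions, sharing 8 KV-heads across 32 query heads cuts the full-context KV cache from roughly 16 GiB to 4 GiB per sequence.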
## What is EXAONE-3.5-7.8B-Instruct?
EXAONE-3.5-7.8B-Instruct belongs to the EXAONE 3.5 collection of bilingual (English and Korean) instruction-tuned language models developed by LG AI Research. This mid-sized variant balances computational efficiency against performance, with 6.98B parameters (excluding embeddings) and a 32,768-token context window, making it a practical choice for bilingual applications.
## Implementation Details
The model uses Grouped-Query Attention (GQA), with 32 query heads sharing 8 key-value heads across its 32 layers, and a vocabulary of 102,400 tokens. It can be deployed with a range of frameworks, including TensorRT-LLM, vLLM, SGLang, llama.cpp, and Ollama.
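The GQA layout described above can be sketched in a few lines of NumPy: each group of 32 / 8 = 4 query heads attends over one shared K/V head. This is a toy illustration with shrunken dimensions, not the model's actual implementation:

```python
import numpy as np

def gqa_attention(q, k, v, n_q_heads=32, n_kv_heads=8):
    """Grouped-Query Attention: each group of n_q_heads // n_kv_heads
    query heads attends over the same shared K/V head.
    Shapes: q is (n_q_heads, seq, d); k and v are (n_kv_heads, seq, d)."""
    group = n_q_heads // n_kv_heads      # 4 Q-heads per KV-head
    k = np.repeat(k, group, axis=0)      # broadcast KV heads to match Q heads
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v                         # (n_q_heads, seq, d)

rng = np.random.default_rng(0)
seq, d = 5, 16                           # toy sizes, not the real model dims
q = rng.standard_normal((32, seq, d))
k = rng.standard_normal((8, seq, d))     # only 8 KV-heads are stored
v = rng.standard_normal((8, seq, d))
out = gqa_attention(q, k, v)
print(out.shape)                         # (32, 5, 16)
```

The point of the grouping is that only the 8 K/V heads need to be cached during generation, while all 32 query heads still get full attention outputs.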
- Advanced bilingual capabilities with state-of-the-art performance
- Extensive context window of 32K tokens
- Optimized architecture with GQA attention mechanism
- Multiple deployment options and quantization support
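For deployment paths that take raw prompt strings rather than chat messages (e.g. llama.cpp or a completions endpoint), the prompt must carry the model's chat markup. The `[|role|]` markers below follow the EXAONE chat format; treat them as an assumption and confirm against the tokenizer's own chat template (`tokenizer.apply_chat_template`) before relying on them:

```python
# Minimal raw-prompt builder for EXAONE-style chat markup.
# The [|system|] / [|user|] / [|assistant|] / [|endofturn|] markers are
# assumed from the EXAONE chat format; verify against the tokenizer.

def build_prompt(system, turns):
    """turns: list of (user, assistant) pairs; the final assistant entry
    may be None to leave the prompt open for generation."""
    parts = [f"[|system|]{system}[|endofturn|]"]
    for user, assistant in turns:
        parts.append(f"[|user|]{user}")
        if assistant is None:
            parts.append("[|assistant|]")  # open turn: model continues here
        else:
            parts.append(f"[|assistant|]{assistant}[|endofturn|]")
    return "\n".join(parts)

prompt = build_prompt(
    "You are EXAONE model from LG AI Research, a helpful assistant.",
    [("Explain GQA in one sentence.", None)],
)
print(prompt)
```

When serving through frameworks that consume structured messages (vLLM's chat endpoint, Transformers), the tokenizer's built-in chat template applies this markup automatically and is the safer route.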
## Core Capabilities
- Strong performance on MT-Bench (8.29) and LiveBench (39.8)
- Excellent bilingual understanding and generation
- Long-context processing and comprehension
- Real-world use case optimization
- Support for various deployment frameworks
## Frequently Asked Questions
**Q: What makes this model unique?**

A: The model stands out for its English-Korean bilingual capabilities, its 32K-token context window, and competitive results across standard benchmarks. It outperforms similarly sized models on real-world usage benchmarks while remaining efficient to run.
**Q: What are the recommended use cases?**

A: The model is well suited to bilingual applications requiring English and Korean processing, tasks that depend on long-context understanding, and deployments where the balance between performance and resource efficiency matters.