EXAONE-3.5-7.8B-Instruct

Maintained By
LGAI-EXAONE

  • Parameters: 6.98B (without embeddings)
  • Context Length: 32,768 tokens
  • Architecture: 32 layers, GQA with 32 query heads and 8 key-value heads
  • Vocabulary Size: 102,400
  • License: EXAONE AI Model License Agreement 1.1 - NC
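One practical consequence of the 32-query/8-KV head split: relative to standard multi-head attention, the KV cache shrinks by the head-count ratio. A quick sanity check (only the head counts come from this card; dtype, head dimension, and sequence length all cancel out of the ratio):

```python
# KV-cache size of GQA relative to standard multi-head attention (MHA).
q_heads, kv_heads = 32, 8

mha_kv_tensors = 2 * q_heads   # K and V cached for every head under MHA
gqa_kv_tensors = 2 * kv_heads  # K and V cached only for the 8 KV heads

reduction = mha_kv_tensors / gqa_kv_tensors
print(reduction)  # 4.0, i.e. a 4x smaller KV cache at any context length
```

At the full 32,768-token context this 4x reduction is a large share of the inference memory budget.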

What is EXAONE-3.5-7.8B-Instruct?

EXAONE-3.5-7.8B-Instruct belongs to the EXAONE 3.5 collection of bilingual language models developed by LG AI Research. This variant balances computational efficiency against performance, with 6.98B parameters and a 32,768-token context window. The model handles both English and Korean, making it particularly valuable for bilingual applications.

Implementation Details

The model uses Grouped-Query Attention (GQA) with 32 query heads and 8 key-value heads across its 32 layers, together with a large vocabulary of 102,400 tokens. It can be deployed with a range of frameworks, including TensorRT-LLM, vLLM, SGLang, llama.cpp, and Ollama.

  • Advanced bilingual capabilities with state-of-the-art performance
  • Extensive context window of 32K tokens
  • Optimized architecture with GQA attention mechanism
  • Multiple deployment options and quantization support
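The GQA layout described above (32 query heads sharing 8 key-value heads) can be sketched in a few lines of NumPy. The head counts match this card; `head_dim` and `seq_len` are arbitrary demo values, not taken from the model:

```python
import numpy as np

# Illustrative sketch of Grouped-Query Attention for a single layer.
rng = np.random.default_rng(0)
n_q_heads, n_kv_heads, head_dim, seq_len = 32, 8, 64, 10
group_size = n_q_heads // n_kv_heads  # 4 query heads share each KV head

q = rng.standard_normal((n_q_heads, seq_len, head_dim))
k = rng.standard_normal((n_kv_heads, seq_len, head_dim))
v = rng.standard_normal((n_kv_heads, seq_len, head_dim))

# Expand K/V so each group of 4 query heads attends over the same KV head.
k_exp = np.repeat(k, group_size, axis=0)  # (32, seq_len, head_dim)
v_exp = np.repeat(v, group_size, axis=0)

# Scaled dot-product attention with a numerically stable softmax.
scores = q @ k_exp.transpose(0, 2, 1) / np.sqrt(head_dim)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ v_exp

print(out.shape)  # (32, 10, 64): one output per query head
```

Only the 8 K/V projections need to be cached at inference time; the `np.repeat` expansion is a view of that smaller cache, which is where GQA's memory savings come from.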

Core Capabilities

  • Strong performance in MT-Bench (8.29) and LiveBench (39.8)
  • Excellent bilingual understanding and generation
  • Long-context processing and comprehension
  • Real-world use case optimization
  • Support for various deployment frameworks

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its bilingual capabilities, extensive context window, and competitive performance metrics across various benchmarks. It outperforms similar-sized models in real-world applications while maintaining efficiency.

Q: What are the recommended use cases?

The model is particularly well-suited for bilingual applications requiring English and Korean language processing, long-context understanding, and deployment scenarios where balance between performance and resource efficiency is crucial.
