EXAONE-3.0-7.8B-Instruct-AWQ
Property | Value |
---|---|
Parameter Count | 7.8B |
Quantization | 4-bit AWQ |
Languages | English, Korean |
License | EXAONE AI Model License Agreement 1.1 - NC |
Paper | arXiv:2408.03541 |
What is EXAONE-3.0-7.8B-Instruct-AWQ?
EXAONE-3.0-7.8B-Instruct-AWQ is a quantized version of the original EXAONE-3.0-7.8B-Instruct model developed by LG AI Research. This version applies the AWQ (Activation-aware Weight Quantization) technique for group-wise 4-bit weight-only quantization (W4A16g128), significantly reducing the model's memory footprint while preserving performance.
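To make the W4A16g128 notation concrete, here is a toy pure-Python sketch of group-wise 4-bit weight-only quantization: weights are split into groups of 128, each group stores 4-bit integers plus a scale and zero-point, and values are dequantized back to higher precision at inference. This illustrates only the storage scheme; actual AWQ additionally rescales salient weight channels using activation statistics before quantizing.

```python
# Toy illustration of group-wise 4-bit weight-only quantization (W4A16g128).
# Not the real AWQ algorithm: AWQ also applies activation-aware per-channel
# scaling to protect salient weights. This shows the storage scheme only.

GROUP_SIZE = 128  # the "g128": each group of 128 weights shares a scale/zero-point


def quantize_group(weights):
    """Asymmetric 4-bit quantization of one group of float weights."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / 15 or 1.0  # 4 bits -> 16 levels (0..15)
    zero = round(-w_min / scale)         # integer zero-point
    q = [max(0, min(15, round(w / scale) + zero)) for w in weights]
    return q, scale, zero


def dequantize_group(q, scale, zero):
    """Reconstruct approximate weights; these multiply 16-bit activations (the W4A16 part)."""
    return [(v - zero) * scale for v in q]


def quantize(weights, group_size=GROUP_SIZE):
    """Split a flat weight list into groups and quantize each group independently."""
    return [quantize_group(weights[i:i + group_size])
            for i in range(0, len(weights), group_size)]
```

Each group's reconstruction error is bounded by its own scale, which is why small groups (128 rather than a whole row) keep 4-bit quantization accurate.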
Implementation Details
The model uses group-wise quantization to compress the original weights while preserving its capabilities. For optimal performance it requires LG AI Research's custom forks of the transformers and AWQ libraries rather than the stock releases.
- Implements group-wise 4-bit weight quantization
- Maintains 16-bit activations for balanced performance
- Requires custom transformer and AWQ library implementations
- Supports both English and Korean language processing
Core Capabilities
- Bilingual instruction following (English and Korean)
- Efficient memory usage through quantization
- System prompt awareness for enhanced context understanding
- Structured dialogue handling through chat templates
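The capabilities above (system-prompt awareness and chat-template-based dialogue) can be sketched as follows. The repository id, the example system prompt, and the generation settings are assumptions, and a real run requires a GPU plus the custom transformers/AWQ forks noted earlier; only the message-building helper is exercised here.

```python
# Hedged sketch of structured dialogue via a chat template.
# MODEL_ID and the system prompt text are assumptions, not confirmed by this card.

MODEL_ID = "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct-AWQ"  # assumed Hugging Face repo id


def build_messages(system_prompt, user_prompt):
    """Chat-format message list: a system turn (the model is system-prompt aware)
    followed by the user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def main():
    # Heavyweight part: requires a GPU and the custom library forks; shown for shape only.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,
        device_map="auto",
        trust_remote_code=True,  # EXAONE ships custom modeling code
    )
    messages = build_messages(
        "You are a helpful assistant.",  # placeholder system prompt
        "Explain AWQ quantization in one sentence.",
    )
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=128)
    print(tokenizer.decode(output[0], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```

Because the template inserts the system turn before the user turn, the same helper works for both English and Korean prompts without changes.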
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its efficient quantization approach, which retains the capabilities of the original 7.8B-parameter model while making it far easier to deploy in resource-constrained environments.
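The deployment claim can be made concrete with weights-only arithmetic from the card's own numbers (7.8B parameters, 4-bit weights): this back-of-envelope estimate ignores activations, KV cache, and per-group quantization metadata, so real memory use is somewhat higher.

```python
# Back-of-envelope weight-memory comparison for a 7.8B-parameter model.
# Weights only: ignores activations, KV cache, and per-group scales/zero-points.

PARAMS = 7.8e9  # parameter count from the model card


def weight_gigabytes(bits_per_weight):
    """Decimal gigabytes needed to store all weights at the given precision."""
    return PARAMS * bits_per_weight / 8 / 1e9


fp16_gb = weight_gigabytes(16)  # 16-bit weights
awq4_gb = weight_gigabytes(4)   # 4-bit AWQ weights
print(f"fp16: {fp16_gb:.1f} GB, 4-bit AWQ: {awq4_gb:.1f} GB")
```

The 4x reduction is what moves the model from multi-GPU or high-end-GPU territory into the range of a single consumer GPU.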
Q: What are the recommended use cases?
The model is well-suited for bilingual applications requiring instruction following and dialogue generation, particularly in scenarios where memory efficiency is crucial. It's specifically designed to work with system prompts for enhanced contextual understanding.