EXAONE-3.0-7.8B-Instruct-AWQ
Property | Value |
---|---|
Parameter Count | 7.8B |
Quantization | 4-bit AWQ |
Languages | English, Korean |
License | EXAONE AI Model License Agreement 1.1 - NC |
Paper | arXiv:2408.03541 |
What is EXAONE-3.0-7.8B-Instruct-AWQ?
EXAONE-3.0-7.8B-Instruct-AWQ is a quantized version of the original EXAONE-3.0-7.8B-Instruct model developed by LG AI Research. This version applies the AWQ (Activation-aware Weight Quantization) technique for group-wise 4-bit weight-only quantization (W4A16g128), significantly reducing the model's memory footprint while preserving performance.
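To make the W4A16g128 notation concrete, here is a toy pure-Python sketch of group-wise 4-bit weight-only quantization: weights are split into groups of 128, each group stores 4-bit integers plus a scale and zero-point, and values are dequantized back to higher precision at inference. This illustrates only the storage scheme; actual AWQ additionally rescales salient weight channels using activation statistics before quantizing.

```python
# Toy illustration of group-wise 4-bit weight-only quantization (W4A16g128).
# Not the real AWQ algorithm: AWQ also applies activation-aware per-channel
# scaling to protect salient weights. This shows the storage scheme only.

GROUP_SIZE = 128  # the "g128": each group of 128 weights shares a scale/zero-point


def quantize_group(weights):
    """Asymmetric 4-bit quantization of one group of float weights."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / 15 or 1.0  # 4 bits -> 16 levels (0..15)
    zero = round(-w_min / scale)         # integer zero-point
    q = [max(0, min(15, round(w / scale) + zero)) for w in weights]
    return q, scale, zero


def dequantize_group(q, scale, zero):
    """Reconstruct approximate weights; these multiply 16-bit activations (the W4A16 part)."""
    return [(v - zero) * scale for v in q]


def quantize(weights, group_size=GROUP_SIZE):
    """Split a flat weight list into groups and quantize each group independently."""
    return [quantize_group(weights[i:i + group_size])
            for i in range(0, len(weights), group_size)]
```

Each group's reconstruction error is bounded by its own scale, which is why small groups (128 rather than a whole row) keep 4-bit quantization accurate.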
Implementation Details
The model uses group-wise quantization to compress the original weights while preserving its capabilities. For optimal performance it requires LG AI Research's custom forks of the transformers and AWQ libraries rather than the stock releases.
- Implements group-wise 4-bit weight quantization
- Maintains 16-bit activations for balanced performance
- Requires custom transformer and AWQ library implementations
- Supports both English and Korean language processing
Core Capabilities
- Bilingual instruction following (English and Korean)
- Efficient memory usage through quantization
- System prompt awareness for enhanced context understanding
- Structured dialogue handling through chat templates
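The capabilities above (system-prompt awareness and chat-template-based dialogue) can be sketched as follows. The repository id, the example system prompt, and the generation settings are assumptions, and a real run requires a GPU plus the custom transformers/AWQ forks noted earlier; only the message-building helper is exercised here.

```python
# Hedged sketch of structured dialogue via a chat template.
# MODEL_ID and the system prompt text are assumptions, not confirmed by this card.

MODEL_ID = "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct-AWQ"  # assumed Hugging Face repo id


def build_messages(system_prompt, user_prompt):
    """Chat-format message list: a system turn (the model is system-prompt aware)
    followed by the user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def main():
    # Heavyweight part: requires a GPU and the custom library forks; shown for shape only.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,
        device_map="auto",
        trust_remote_code=True,  # EXAONE ships custom modeling code
    )
    messages = build_messages(
        "You are a helpful assistant.",  # placeholder system prompt
        "Explain AWQ quantization in one sentence.",
    )
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=128)
    print(tokenizer.decode(output[0], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```

Because the template inserts the system turn before the user turn, the same helper works for both English and Korean prompts without changes.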
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its efficient quantization approach, which retains the capabilities of the original 7.8B-parameter model while making it far easier to deploy in resource-constrained environments.
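The deployment claim can be made concrete with weights-only arithmetic from the card's own numbers (7.8B parameters, 4-bit weights): this back-of-envelope estimate ignores activations, KV cache, and per-group quantization metadata, so real memory use is somewhat higher.

```python
# Back-of-envelope weight-memory comparison for a 7.8B-parameter model.
# Weights only: ignores activations, KV cache, and per-group scales/zero-points.

PARAMS = 7.8e9  # parameter count from the model card


def weight_gigabytes(bits_per_weight):
    """Decimal gigabytes needed to store all weights at the given precision."""
    return PARAMS * bits_per_weight / 8 / 1e9


fp16_gb = weight_gigabytes(16)  # 16-bit weights
awq4_gb = weight_gigabytes(4)   # 4-bit AWQ weights
print(f"fp16: {fp16_gb:.1f} GB, 4-bit AWQ: {awq4_gb:.1f} GB")
```

The 4x reduction is what moves the model from multi-GPU or high-end-GPU territory into the range of a single consumer GPU.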
Q: What are the recommended use cases?
The model is well-suited for bilingual applications requiring instruction following and dialogue generation, particularly in scenarios where memory efficiency is crucial. It's specifically designed to work with system prompts for enhanced contextual understanding.