# Polyglot-Ko-5.8B
| Property | Value |
|---|---|
| Parameter Count | 5.8B |
| Architecture | GPT-NeoX |
| Training Data | 863 GB of Korean text |
| License | Apache 2.0 |
| Paper | Technical Report |
## What is Polyglot-Ko-5.8B?
Polyglot-Ko-5.8B is a large-scale Korean language model developed by EleutherAI's polyglot team. It represents a significant advancement in Korean natural language processing, featuring 5.8 billion parameters trained on a diverse 863 GB corpus of Korean text.
## Implementation Details
The model is built on the GPT-NeoX framework and consists of 28 transformer layers with a model dimension of 4096 and 16 attention heads. It was trained on 172 billion tokens over 320,000 steps using 256 A100 GPUs. The key hyperparameters are listed below; the sketch after the list shows how to verify them against the published configuration.
- Model dimension: 4096
- Attention heads: 16
- Context length: 2048 tokens
- Vocabulary size: 30,003 tokens
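These values can be checked directly against the configuration published on the Hugging Face Hub. A minimal sketch using the `transformers` library (assumed installed); the attribute names follow the `GPTNeoXConfig` schema:

```python
from transformers import AutoConfig

# Fetch the published model configuration from the Hugging Face Hub
config = AutoConfig.from_pretrained("EleutherAI/polyglot-ko-5.8b")

print(config.num_hidden_layers)        # transformer layers: 28
print(config.hidden_size)              # model dimension: 4096
print(config.num_attention_heads)      # attention heads: 16
print(config.max_position_embeddings)  # context length: 2048
print(config.vocab_size)               # vocabulary size (embedding matrix may be padded)
```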
## Core Capabilities
- State-of-the-art results among similarly sized Korean models on KOBEST benchmark tasks
- Excels at COPA (78.87% F1 with 50-shot prompting)
- Strong sentiment analysis performance (95.21% F1 with 50-shot prompting)
- Handles open-ended Korean text generation, as the sketch below shows
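For generation, the model loads like any other causal language model in `transformers`. A minimal sketch, assuming a GPU with enough memory for half-precision weights and the `accelerate` package installed for `device_map="auto"`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-5.8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Half precision keeps the 5.8B-parameter weights within a single-GPU budget
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "한국의 수도는"  # "The capital of Korea is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```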
## Frequently Asked Questions
Q: What makes this model unique?
A: This model stands out for its specialized Korean language capabilities, extensive training on a curated Korean dataset, and superior performance on a range of Korean NLP tasks compared with similarly sized models.
Q: What are the recommended use cases?
A: The model is well-suited for Korean text generation, sentiment analysis, question answering, and other downstream NLP tasks. It is particularly effective with few-shot prompting, as the sketch below illustrates.
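Few-shot prompting places a handful of labeled examples in the prompt so the model continues the pattern. A minimal sentiment-classification sketch, assuming the same environment as the generation example above; the sentences and the `긍정`/`부정` (positive/negative) label format are illustrative, not the benchmark's exact prompt:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-5.8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# In-context examples establish a "sentence -> label" pattern for the model to continue
few_shot_prompt = (
    "문장: 이 영화 정말 재미있었어요.\n감정: 긍정\n\n"    # positive example
    "문장: 서비스가 느리고 불친절했어요.\n감정: 부정\n\n"  # negative example
    "문장: 배송이 빠르고 포장도 깔끔했습니다.\n감정:"      # query sentence
)
inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2, do_sample=False)
# Decode only the newly generated label tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```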