Polyglot-Ko-5.8B
| Property | Value |
|---|---|
| Parameter Count | 5.8B |
| Architecture | GPT-NeoX |
| Training Data | 863GB Korean text |
| License | Apache 2.0 |
| Paper | Technical Report |
What is polyglot-ko-5.8b?
Polyglot-Ko-5.8B is a large-scale Korean language model developed by EleutherAI's polyglot team. With 5.8 billion parameters trained on a diverse 863GB corpus of Korean text, it represents a significant step forward for Korean natural language processing.
Implementation Details
The model is built on the GPT-NeoX framework and consists of 28 transformer layers with a model dimension of 4096 and 16 attention heads. It was trained for 172 billion tokens over 320,000 steps using 256 A100 GPUs.
- Model dimension: 4096
- Attention heads: 16
- Context length: 2048 tokens
- Vocabulary size: 30,003 tokens
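The checkpoint is published on the Hugging Face Hub as EleutherAI/polyglot-ko-5.8b. As a minimal sketch (assuming the transformers library is installed), the hyperparameters above can be read straight from the released configuration without downloading the full weights:

```python
# Minimal sketch: read the architecture hyperparameters of polyglot-ko-5.8b
# from its published config (assumes the `transformers` library).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("EleutherAI/polyglot-ko-5.8b")

print(config.num_hidden_layers)        # 28 transformer layers
print(config.hidden_size)              # 4096 model dimension
print(config.num_attention_heads)      # 16 attention heads
print(config.max_position_embeddings)  # 2048-token context window
print(config.vocab_size)               # vocabulary size
```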
Core Capabilities
- Strong results on standard Korean language understanding benchmarks relative to similarly sized models
- 78.87% F1 on COPA with 50-shot in-context learning
- 95.21% F1 on sentiment analysis with 50-shot in-context learning
- Handles a wide range of Korean text generation tasks (see the generation sketch below)
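A hedged sketch of plain Korean text generation with the checkpoint follows; the prompt and sampling settings are illustrative, and half-precision loading with device_map="auto" (which requires the accelerate package) is assumed so the weights fit on a single large GPU:

```python
# Illustrative Korean text generation with polyglot-ko-5.8b.
# Sampling settings are examples, not recommendations from the report.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-5.8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # roughly 12 GB of weights in fp16
    device_map="auto",           # requires the `accelerate` package
)

prompt = "한국의 전통 음식 중에서 가장 유명한 것은"  # "The most famous of Korea's traditional foods is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```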
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its specialized Korean language capabilities, extensive training on a curated Korean dataset, and superior performance on various Korean NLP tasks compared to similar-sized models.
Q: What are the recommended use cases?
The model is well-suited for Korean text generation, sentiment analysis, question answering, and various downstream NLP tasks. It's particularly effective when used with few-shot learning approaches.
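Few-shot use amounts to in-context prompting. Below is a hypothetical 2-shot sentiment prompt; the example sentences and the 긍정/부정 (positive/negative) labels are illustrative and not taken from the evaluation setup behind the figures above:

```python
# Hypothetical 2-shot sentiment-classification prompt for polyglot-ko-5.8b.
# Example sentences and labels are illustrative, not from the report's benchmark.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-5.8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

few_shot_prompt = (
    "문장: 이 영화 정말 재미있었어요.\n감정: 긍정\n\n"       # "This movie was really fun." -> positive
    "문장: 서비스가 느리고 불친절했습니다.\n감정: 부정\n\n"   # "The service was slow and unfriendly." -> negative
    "문장: 배우들의 연기가 인상 깊었다.\n감정:"              # query: "The actors' performances were impressive."
)

inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2, do_sample=False)

answer = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(answer.strip())  # expected to continue with "긍정" (positive)
```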