Polyglot-Ko-12.8B
| Property | Value |
|---|---|
| Parameter Count | 12.8B |
| License | Apache 2.0 |
| Paper | Technical Report |
| Training Data | 863GB Korean text |
| Context Length | 2,048 tokens |
What is polyglot-ko-12.8b?
Polyglot-Ko-12.8B is a large-scale Korean autoregressive language model developed by EleutherAI's Polyglot team. It is one of the largest openly released Korean language models, trained on 863GB of Korean text. The model uses a 40-layer transformer architecture and reports strong results across a range of Korean NLP tasks.
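A minimal usage sketch with the Hugging Face `transformers` library is shown below. It assumes the checkpoint is published under the Hub ID `EleutherAI/polyglot-ko-12.8b` and that a GPU with enough memory for a 12.8B-parameter model is available; dtype and device choices will vary by setup.

```python
# Minimal generation sketch; assumes the Hub ID below and that `transformers` and
# `accelerate` are installed. A 12.8B model in fp16 needs roughly 26 GB of GPU memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-12.8b"  # assumed Hugging Face Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # spread layers across available devices (needs `accelerate`)
)

prompt = "한국의 전통 음식 중에서"  # "Among traditional Korean foods, ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # avoid warnings when no pad token is defined
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```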
Implementation Details
The architecture comprises 40 transformer layers with a model dimension of 5120 and 40 attention heads. It uses Rotary Position Embedding (RoPE) for positional encoding and was trained on diverse Korean sources, including blog posts, news articles, and academic text. The key configuration values are listed below and can be read back from the published model configuration, as shown in the sketch after the list.
- 40 transformer layers with 5120 model dimension
- 20,480 feedforward dimension
- 40 attention heads with 128 dimensions each
- 2,048 token context length
- 30,003 vocabulary size
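These figures can be cross-checked against the checkpoint's configuration without downloading any weights. The sketch below assumes the same Hub ID as above and that the config follows the GPT-NeoX format used by the Polyglot-Ko releases.

```python
# Read the architecture hyperparameters from the model config (no weights are downloaded).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("EleutherAI/polyglot-ko-12.8b")  # assumed Hub ID

print(config.num_hidden_layers)        # expected: 40 transformer layers
print(config.hidden_size)              # expected: 5120 model dimension
print(config.intermediate_size)        # expected: 20480 feedforward dimension
print(config.num_attention_heads)      # expected: 40 heads (5120 / 40 = 128 dims per head)
print(config.max_position_embeddings)  # expected: 2048 token context length
print(config.vocab_size)               # roughly 30,003; may be padded slightly for efficiency
```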
Core Capabilities
- State-of-the-art performance on KOBEST benchmark tasks
- Excels at sentiment analysis, scoring 97.23% in the 50-shot setting
- Strong performance on the COPA (83.69%) and HellaSwag (61.18%) tasks
- Advanced Korean text generation and understanding
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its training on a large Korean-specific corpus (863GB) and its strong results across multiple benchmarks. Its training data was preprocessed with PII masking for privacy, and the model performs well in few-shot settings.
Q: What are the recommended use cases?
The model excels in Korean text generation, sentiment analysis, and various natural language understanding tasks. It's particularly effective for applications requiring deep understanding of Korean language context and nuances.
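As a concrete illustration of the sentiment-analysis use case, the snippet below builds a simple few-shot prompt and lets the model complete the label. The demonstration sentences and label words are made up for this sketch (they are not the KOBEST evaluation format), and it reuses the tokenizer and model loaded in the first example.

```python
# Few-shot sentiment prompt; labels 긍정 (positive) / 부정 (negative) are illustrative only.
few_shot_prompt = (
    "문장: 이 영화는 정말 재미있었다.\n감정: 긍정\n\n"    # "This movie was really fun." -> positive
    "문장: 서비스가 느리고 불친절했다.\n감정: 부정\n\n"   # "The service was slow and unfriendly." -> negative
    "문장: 배송이 빨라서 만족스러웠다.\n감정:"            # query: "Delivery was fast, so I was satisfied."
)

inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=3, do_sample=False,
                         pad_token_id=tokenizer.eos_token_id)
# Print only the newly generated label tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True).strip())
```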