Polyglot-Ko-12.8B
| Property | Value |
|---|---|
| Parameter Count | 12.8B |
| License | Apache 2.0 |
| Paper | Technical Report |
| Training Data | 863GB Korean text |
| Context Length | 2,048 tokens |
What is polyglot-ko-12.8b?
Polyglot-Ko-12.8B is a large-scale Korean autoregressive language model developed by EleutherAI's Polyglot team. It is one of the largest openly released Korean language models, trained on 863GB of Korean text. The model uses a 40-layer transformer architecture and reports strong results across a range of Korean NLP tasks.
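A minimal usage sketch with the Hugging Face `transformers` library is shown below. It assumes the checkpoint is published under the Hub ID `EleutherAI/polyglot-ko-12.8b` and that a GPU with enough memory for a 12.8B-parameter model is available; dtype and device choices will vary by setup.

```python
# Minimal generation sketch; assumes the Hub ID below and that `transformers` and
# `accelerate` are installed. A 12.8B model in fp16 needs roughly 26 GB of GPU memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-12.8b"  # assumed Hugging Face Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # spread layers across available devices (needs `accelerate`)
)

prompt = "한국의 전통 음식 중에서"  # "Among traditional Korean foods, ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # avoid warnings when no pad token is defined
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```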
Implementation Details
The architecture comprises 40 transformer layers with a model dimension of 5120 and 40 attention heads. It uses Rotary Position Embedding (RoPE) for positional encoding and was trained on diverse Korean sources, including blog posts, news articles, and academic text. The key configuration values are listed below and can be read back from the published model configuration, as shown in the sketch after the list.
- 40 transformer layers with 5120 model dimension
- 20,480 feedforward dimension
- 40 attention heads with 128 dimensions each
- 2,048 token context length
- 30,003 vocabulary size
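These figures can be cross-checked against the checkpoint's configuration without downloading any weights. The sketch below assumes the same Hub ID as above and that the config follows the GPT-NeoX format used by the Polyglot-Ko releases.

```python
# Read the architecture hyperparameters from the model config (no weights are downloaded).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("EleutherAI/polyglot-ko-12.8b")  # assumed Hub ID

print(config.num_hidden_layers)        # expected: 40 transformer layers
print(config.hidden_size)              # expected: 5120 model dimension
print(config.intermediate_size)        # expected: 20480 feedforward dimension
print(config.num_attention_heads)      # expected: 40 heads (5120 / 40 = 128 dims per head)
print(config.max_position_embeddings)  # expected: 2048 token context length
print(config.vocab_size)               # roughly 30,003; may be padded slightly for efficiency
```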
Core Capabilities
- State-of-the-art performance on KOBEST benchmark tasks
- Excels at sentiment analysis, scoring 97.23% in the 50-shot setting
- Strong performance on the COPA (83.69%) and HellaSwag (61.18%) tasks
- Advanced Korean text generation and understanding
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its training on a large Korean-specific corpus (863GB) and its strong results across multiple benchmarks. Its training data was preprocessed with PII masking for privacy, and the model performs well in few-shot settings.
Q: What are the recommended use cases?
The model excels in Korean text generation, sentiment analysis, and various natural language understanding tasks. It's particularly effective for applications requiring deep understanding of Korean language context and nuances.
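As a concrete illustration of the sentiment-analysis use case, the snippet below builds a simple few-shot prompt and lets the model complete the label. The demonstration sentences and label words are made up for this sketch (they are not the KOBEST evaluation format), and it reuses the tokenizer and model loaded in the first example.

```python
# Few-shot sentiment prompt; labels 긍정 (positive) / 부정 (negative) are illustrative only.
few_shot_prompt = (
    "문장: 이 영화는 정말 재미있었다.\n감정: 긍정\n\n"    # "This movie was really fun." -> positive
    "문장: 서비스가 느리고 불친절했다.\n감정: 부정\n\n"   # "The service was slow and unfriendly." -> negative
    "문장: 배송이 빨라서 만족스러웠다.\n감정:"            # query: "Delivery was fast, so I was satisfied."
)

inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=3, do_sample=False,
                         pad_token_id=tokenizer.eos_token_id)
# Print only the newly generated label tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True).strip())
```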