Polyglot-Ko-1.3B
| Property | Value |
|---|---|
| Parameters | 1.3B |
| Architecture | GPT-NeoX |
| Training Data | 863GB Korean Text |
| License | Apache 2.0 |
| Paper | Technical Report |
What is polyglot-ko-1.3b?
Polyglot-Ko-1.3B is a large-scale Korean language model developed by EleutherAI's Polyglot team. It was trained on a diverse 863GB corpus of Korean text, making it one of the most comprehensive Korean language models openly available. The model uses a transformer architecture with 24 layers, a hidden dimension of 2048, and 16 attention heads.
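A minimal usage sketch with the Hugging Face Transformers library, assuming the model is hosted on the Hub under the id EleutherAI/polyglot-ko-1.3b and that transformers and torch are installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-1.3b"  # assumed Hugging Face Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # on GPU: add torch_dtype=torch.float16 and model.to("cuda")

prompt = "한국의 수도는"  # "The capital of Korea is ..."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```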
Implementation Details
The model is built on the GPT-NeoX framework and was trained on 256 A100 GPUs for 102,000 steps, processing 213 billion tokens. It uses Rotary Position Embedding (RoPE) and has a 2,048-token context window. The key architectural dimensions are summarized below; a short config-check sketch follows the list.
- 24 transformer layers with a model dimension of 2048
- Feed-forward dimension of 8192
- 16 attention heads with 128 dimensions each
- Vocabulary size of 30,003
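These figures can be sanity-checked against the published configuration. A small sketch, assuming the standard field names of the Transformers GPT-NeoX config and the same hub id as above:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("EleutherAI/polyglot-ko-1.3b")  # assumed hub id
print(config.num_hidden_layers)        # 24 transformer layers
print(config.hidden_size)              # 2048 model dimension
print(config.intermediate_size)        # 8192 feed-forward dimension
print(config.num_attention_heads)      # 16 heads (2048 / 16 = 128 dims per head)
print(config.max_position_embeddings)  # 2048-token context window
print(config.vocab_size)               # ~30,003 (embedding matrix may be padded slightly)
```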
Core Capabilities
- Strong performance on Korean language understanding tasks
- Competitive results on KOBEST benchmark
- PII masking applied during data preprocessing to protect personal information
- Suitable for text generation and completion tasks (see the pipeline sketch below)
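For quick generation and completion experiments, the high-level pipeline API is sufficient; a sketch assuming the same hub id:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/polyglot-ko-1.3b")  # assumed hub id
result = generator("오늘 날씨가 좋아서", max_new_tokens=40, do_sample=True, top_p=0.9)  # "Since the weather is nice today ..."
print(result[0]["generated_text"])
```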
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its specialized Korean-language capabilities and strong performance relative to its modest size. On the KOBEST benchmark it outperforms Korean models of comparable size, and on several tasks even larger ones.
Q: What are the recommended use cases?
The model excels at Korean text generation, sentiment analysis, and various downstream NLP tasks. It performs particularly well on the KOBEST COPA (causal reasoning), HellaSwag (commonsense reasoning), and SentiNeg (sentiment analysis) tasks.
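Because this is a base model rather than an instruction-tuned one, classification-style tasks such as sentiment analysis are typically handled with few-shot prompting. A hypothetical sketch; the prompt format and the 긍정/부정 (positive/negative) labels here are illustrative and not taken from the SentiNeg evaluation setup:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-1.3b"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative few-shot prompt: two labeled examples, then a sentence to classify.
prompt = (
    "문장: 배송이 빠르고 제품이 훌륭해요.\n감정: 긍정\n\n"
    "문장: 품질이 너무 별로라서 실망했어요.\n감정: 부정\n\n"
    "문장: 가격 대비 성능이 정말 만족스럽습니다.\n감정:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=2, do_sample=False)
# Decode only the newly generated tokens (the predicted label).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```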