Polyglot-Ko-5.8B
| Property | Value |
|---|---|
| Parameter Count | 5.8B |
| Architecture | GPT-NeoX |
| Training Data | 863GB Korean text |
| License | Apache 2.0 |
| Paper | Technical Report |
What is polyglot-ko-5.8b?
Polyglot-Ko-5.8B is a large-scale Korean language model developed by EleutherAI's polyglot team. With 5.8 billion parameters trained on a diverse 863GB corpus of Korean text, it represents a significant step forward for Korean natural language processing.
Implementation Details
The model is built on the GPT-NeoX framework and consists of 28 transformer layers with a model dimension of 4096 and 16 attention heads. It was trained for 172 billion tokens over 320,000 steps using 256 A100 GPUs.
- Model dimension: 4096
- Attention heads: 16
- Context length: 2048 tokens
- Vocabulary size: 30,003 tokens
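The checkpoint is published on the Hugging Face Hub as EleutherAI/polyglot-ko-5.8b. As a minimal sketch (assuming the transformers library is installed), the hyperparameters above can be read straight from the released configuration without downloading the full weights:

```python
# Minimal sketch: read the architecture hyperparameters of polyglot-ko-5.8b
# from its published config (assumes the `transformers` library).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("EleutherAI/polyglot-ko-5.8b")

print(config.num_hidden_layers)        # 28 transformer layers
print(config.hidden_size)              # 4096 model dimension
print(config.num_attention_heads)      # 16 attention heads
print(config.max_position_embeddings)  # 2048-token context window
print(config.vocab_size)               # vocabulary size
```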
Core Capabilities
- Strong results on standard Korean language understanding benchmarks relative to similarly sized models
- 78.87% F1 on COPA with 50-shot in-context learning
- 95.21% F1 on sentiment analysis with 50-shot in-context learning
- Handles a wide range of Korean text generation tasks (see the generation sketch below)
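A hedged sketch of plain Korean text generation with the checkpoint follows; the prompt and sampling settings are illustrative, and half-precision loading with device_map="auto" (which requires the accelerate package) is assumed so the weights fit on a single large GPU:

```python
# Illustrative Korean text generation with polyglot-ko-5.8b.
# Sampling settings are examples, not recommendations from the report.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-5.8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # roughly 12 GB of weights in fp16
    device_map="auto",           # requires the `accelerate` package
)

prompt = "한국의 전통 음식 중에서 가장 유명한 것은"  # "The most famous of Korea's traditional foods is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```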
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its specialized Korean language capabilities, extensive training on a curated Korean dataset, and superior performance on various Korean NLP tasks compared to similar-sized models.
Q: What are the recommended use cases?
The model is well-suited for Korean text generation, sentiment analysis, question answering, and various downstream NLP tasks. It's particularly effective when used with few-shot learning approaches.
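Few-shot use amounts to in-context prompting. Below is a hypothetical 2-shot sentiment prompt; the example sentences and the 긍정/부정 (positive/negative) labels are illustrative and not taken from the evaluation setup behind the figures above:

```python
# Hypothetical 2-shot sentiment-classification prompt for polyglot-ko-5.8b.
# Example sentences and labels are illustrative, not from the report's benchmark.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-5.8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

few_shot_prompt = (
    "문장: 이 영화 정말 재미있었어요.\n감정: 긍정\n\n"       # "This movie was really fun." -> positive
    "문장: 서비스가 느리고 불친절했습니다.\n감정: 부정\n\n"   # "The service was slow and unfriendly." -> negative
    "문장: 배우들의 연기가 인상 깊었다.\n감정:"              # query: "The actors' performances were impressive."
)

inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2, do_sample=False)

answer = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(answer.strip())  # expected to continue with "긍정" (positive)
```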