Polyglot-Ko-1.3B

Maintained by EleutherAI

  • Parameters: 1.3B
  • Architecture: GPT-NeoX
  • Training data: 863GB of Korean text
  • License: Apache 2.0
  • Paper: Technical Report

What is polyglot-ko-1.3b?

Polyglot-Ko-1.3B is a large-scale Korean language model developed by EleutherAI's Polyglot team. It is trained on a diverse 863GB corpus of Korean text, making it one of the most comprehensive Korean language models available. The model uses a transformer architecture with 24 layers, a 2048-dimensional hidden state, and 16 attention heads.
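For orientation, the model can be loaded through the Hugging Face transformers library under the public checkpoint ID EleutherAI/polyglot-ko-1.3b. The sketch below shows basic text generation; the prompt and sampling settings are illustrative choices, not recommendations from the model card.

```python
# Minimal text-generation sketch for Polyglot-Ko-1.3B via Hugging Face
# transformers. Generation settings are illustrative defaults.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-1.3b"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)

prompt = "한국의 수도는"  # "The capital of Korea is"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Sample a short continuation; adjust max_new_tokens and temperature to taste.
output_ids = model.generate(
    **inputs,
    max_new_tokens=32,
    do_sample=True,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```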

Implementation Details

The model uses the GPT-NeoX framework and was trained on 256 A100 GPUs for 102,000 steps, processing 213 billion tokens. It implements Rotary Position Embedding (RoPE) and has a context window of 2048 tokens. The key hyperparameters are listed below, followed by a configuration sketch.

  • 24 transformer layers with a model dimension of 2048
  • Feedforward dimension of 8192
  • 16 attention heads with 128 dimensions each
  • Vocabulary size of 30,003
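These hyperparameters map naturally onto transformers' GPTNeoXConfig. The sketch below is an illustrative reconstruction, not the checkpoint's exact shipped configuration (published configs sometimes pad the vocabulary for throughput); it simply shows how the numbers above fit together and roughly recovers the 1.3B parameter count.

```python
# Illustrative mapping of the published hyperparameters onto a GPT-NeoX
# config. Not the exact shipped config; for shape-checking only.
from transformers import GPTNeoXConfig, GPTNeoXForCausalLM

config = GPTNeoXConfig(
    vocab_size=30_003,             # reported vocabulary size
    hidden_size=2_048,             # model dimension
    num_hidden_layers=24,          # transformer layers
    num_attention_heads=16,        # 2048 / 16 = 128 dims per head
    intermediate_size=8_192,       # feedforward dimension
    max_position_embeddings=2_048, # context window
)

model = GPTNeoXForCausalLM(config)  # randomly initialized weights
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")  # ~1.3B
```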

Core Capabilities

  • Strong performance on Korean language understanding tasks
  • Competitive results on the KOBEST benchmark
  • PII protection through masking of sensitive information in the training data
  • Suitable for text generation and completion tasks

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its specialized Korean language capabilities and impressive performance despite its relatively modest size. It consistently outperforms similar-sized models and even some larger models on Korean language tasks.

Q: What are the recommended use cases?

The model excels at Korean text generation, sentiment analysis, and a range of downstream NLP tasks. It is particularly effective on COPA (causal reasoning), HellaSwag (commonsense reasoning), and SentiNeg (sentiment analysis), all of which can be framed as the multiple-choice scoring sketched below.
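For classification-style benchmarks such as COPA or SentiNeg, a common zero-shot recipe is to score each candidate continuation by its likelihood under the model and pick the most probable one. The sketch below illustrates that recipe with a made-up Korean example; the prompt wording and scoring details are assumptions, not the official KOBEST evaluation harness.

```python
# Zero-shot multiple-choice scoring: pick the candidate continuation with
# the lowest average negative log-likelihood under the model.
# Illustrative only -- not the official KOBEST evaluation setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

def score(prompt: str, candidate: str) -> float:
    """Average negative log-likelihood of `candidate` given `prompt`.

    Assumes tokenizing `prompt` alone yields a prefix of tokenizing
    `prompt + candidate`, which holds for typical BPE tokenizers here.
    """
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + candidate, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Log-prob of each token given its prefix (positions shifted by one).
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    token_lls = log_probs[torch.arange(targets.size(0)), targets]
    cand_lls = token_lls[prompt_ids.size(1) - 1 :]  # candidate tokens only
    return -cand_lls.mean().item()

premise = "비가 많이 와서 "  # "Because it rained heavily, "
choices = ["우산을 챙겼다.", "선글라스를 꼈다."]  # "took an umbrella." / "put on sunglasses."
print(min(choices, key=lambda c: score(premise, c)))
```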
