Polyglot-Ko-5.8B

Maintained by: EleutherAI

  • Parameter Count: 5.8B
  • Architecture: GPT-NeoX
  • Training Data: 863GB of Korean text
  • License: Apache 2.0
  • Paper: Technical Report

What is polyglot-ko-5.8b?

Polyglot-Ko-5.8B is a large-scale Korean language model developed by EleutherAI's Polyglot team. A significant step forward for Korean natural language processing, it features 5.8 billion parameters and was trained on a diverse 863GB corpus of Korean text.
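The checkpoint is distributed through the Hugging Face Hub under the repo id EleutherAI/polyglot-ko-5.8b, so the standard transformers auto classes can load it. A minimal sketch, assuming transformers and a recent PyTorch are installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Official Hugging Face repo id for this checkpoint
model_id = "EleutherAI/polyglot-ko-5.8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # ~5.8B params; expect a large download
```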

Implementation Details

The model is built on the GPT-NeoX framework and consists of 28 transformer layers with a model dimension of 4096 and 16 attention heads. It was trained for 172 billion tokens over 320,000 steps on 256 A100 GPUs. The key hyperparameters are listed below, and the sketch after the list shows how to read them back from the published config.

  • Model dimension: 4096
  • Attention heads: 16
  • Context length: 2048 tokens
  • Vocabulary size: 30,003 tokens
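
As a quick sanity check, these numbers can be read directly from the model's config on the Hub. A minimal sketch; note that the config's vocab_size may be padded slightly above the tokenizer's 30,003 entries for training efficiency:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("EleutherAI/polyglot-ko-5.8b")

print(config.num_hidden_layers)        # transformer layers: 28
print(config.hidden_size)              # model dimension: 4096
print(config.num_attention_heads)      # attention heads: 16
print(config.max_position_embeddings)  # context length: 2048
print(config.vocab_size)               # vocabulary size (may be padded above 30,003)
```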

Core Capabilities

  • State-of-the-art performance on Korean language tasks among similar-sized models
  • Excels at COPA (78.87% F1 with 50-shot prompting)
  • Strong performance in sentiment analysis (95.21% F1 with 50-shot prompting; see the few-shot sketch after this list)
  • Handles a wide range of Korean text generation tasks
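
The 50-shot numbers above come from prompting the model with labeled examples rather than fine-tuning it. A minimal few-shot sketch in that spirit; the prompt template, example sentences, and 긍정/부정 ("positive"/"negative") labels are illustrative, not the exact format used in the reported evaluations, and device_map="auto" assumes the accelerate package is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-5.8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Illustrative 2-shot sentiment prompt.
# "문장" = sentence, "감정" = sentiment, "긍정" = positive, "부정" = negative.
prompt = (
    "문장: 이 영화 정말 재미있었어요.\n감정: 긍정\n\n"
    "문장: 서비스가 너무 별로였습니다.\n감정: 부정\n\n"
    "문장: 배송이 빠르고 포장도 꼼꼼했어요.\n감정:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=3, do_sample=False)

# Decode only the newly generated tokens, i.e. the model's predicted label
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```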

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized Korean language capabilities, extensive training on a curated Korean dataset, and superior performance on various Korean NLP tasks compared to similar-sized models.

Q: What are the recommended use cases?

The model is well-suited for Korean text generation, sentiment analysis, question answering, and various downstream NLP tasks. It's particularly effective when used with few-shot learning approaches.
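For quick text-generation experiments, the high-level transformers pipeline wraps tokenization, generation, and decoding in a single call. A short sketch; the prompt "인공지능은" means "Artificial intelligence is", and the sampling parameters are illustrative defaults, not tuned recommendations:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/polyglot-ko-5.8b")
result = generator("인공지능은", max_new_tokens=30, do_sample=True, top_p=0.9)
print(result[0]["generated_text"])
```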
