KR-ELECTRA-generator

Property	Value
Architecture	ELECTRA Generator
Language	Korean
Embedding Size	768
Number of Layers	12
Vocabulary Size	30,000 tokens
Training Data	34GB Korean texts

What is KR-ELECTRA-generator?

KR-ELECTRA-generator is a Korean language model developed by the Computational Linguistics Lab at Seoul National University. It is built on the ELECTRA architecture and tuned for Korean, and it performs particularly well on informal text such as reviews and comments.

Implementation Details

The model follows a base-scale architecture with 12 layers, 768-dimensional embeddings, and a hidden size of 256. It was trained on 34GB of Korean text on a v3-8 TPU, using morpheme-based tokenization built on the Mecab-Ko analyzer. Training ran for 700,000 steps with a batch size of 256 and a learning rate of 2e-4.

Balanced dataset comprising written and spoken Korean text
Morpheme-based tokenization with a 30,000-token vocabulary
4 attention heads in generator configuration
Generator size ratio of 0.33333
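
The figures above fit together arithmetically. The following is a minimal sketch, not the authors' training code: it assumes the generator size ratio is taken relative to a 768-unit discriminator (the usual ELECTRA-base setting, which this card does not state), and the class and field names are purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneratorSpec:
    """Hyperparameters quoted in this card; the names are illustrative only."""
    num_layers: int = 12
    embedding_size: int = 768
    hidden_size: int = 256
    num_attention_heads: int = 4
    vocab_size: int = 30_000
    train_steps: int = 700_000
    batch_size: int = 256
    learning_rate: float = 2e-4


# Assumption: the paired discriminator is 768 units wide (ELECTRA-base scale).
ASSUMED_DISCRIMINATOR_HIDDEN = 768
GENERATOR_SIZE_RATIO = 0.33333

spec = GeneratorSpec()

# 768 * 0.33333 rounds to the 256 hidden units listed for the generator.
derived_hidden = round(ASSUMED_DISCRIMINATOR_HIDDEN * GENERATOR_SIZE_RATIO)

# Each attention head works on an equal slice of the hidden state.
head_dim = spec.hidden_size // spec.num_attention_heads

# Total sequences processed during pre-training: steps * batch size.
sequences_seen = spec.train_steps * spec.batch_size

print(derived_hidden, head_dim, sequences_seen)
```

Under these assumptions the ratio reproduces the 256-unit hidden size, each of the 4 heads attends over 64 dimensions, and the training run covers roughly 179 million sequences.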

Core Capabilities

Strong performance on informal text analysis
Sentiment analysis (NSMC: 91.168% accuracy)
Question-pair classification (Question Pair: 95.51% accuracy)
Hate speech detection (74.50% F1 score)

Frequently Asked Questions

Q: What makes this model unique?

KR-ELECTRA stands out for its training data, which balances written and spoken Korean. This makes it effective on real-world text, especially informal content such as reviews and comments, where it reports strong benchmark results (for example, 91.168% accuracy on NSMC).

Q: What are the recommended use cases?

The model is well suited to sentiment analysis, question answering, named entity recognition, and hate speech detection in Korean text. Its strength on informal text makes it a good fit for social media analysis, review processing, and customer feedback analysis.
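
Because an ELECTRA generator is trained as a masked language model, the most direct way to try it on informal text is a fill-mask query. The sketch below is hedged on several points: the hub id `snunlp/KR-ELECTRA-generator` is an assumption, the helper names are hypothetical, and raw text is passed straight to the tokenizer, since this card does not say whether inputs should first be segmented with Mecab-Ko.

```python
# Assumption: the checkpoint is published on the Hugging Face Hub under this id.
MODEL_ID = "snunlp/KR-ELECTRA-generator"


def mask_word(text: str, word: str, mask_token: str = "[MASK]") -> str:
    """Replace the first occurrence of `word` in `text` with the mask token."""
    if word not in text:
        raise ValueError(f"{word!r} does not occur in the input text")
    return text.replace(word, mask_token, 1)


def predict_masked(text: str, top_k: int = 5) -> list:
    """Return the generator's top_k (token, score) guesses for the masked slot."""
    # Imported lazily so mask_word() works without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=MODEL_ID)  # fetches weights on first use
    return [(r["token_str"], r["score"]) for r in fill(text, top_k=top_k)]


# Example review sentence ("This movie was really fun") with one word masked.
masked = mask_word("이 영화 정말 재미있었어요", "정말")
print(masked)
```

Calling `predict_masked(masked)` would then query the model and requires the `transformers` package plus network access. For the classification tasks listed above, the usual ELECTRA practice is to fine-tune the discriminator rather than the generator.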