pko-t5-large
| Property | Value |
|---|---|
| Parameter Count | 820M |
| License | CC-BY-4.0 |
| Paper | KLUE Paper |
| Architecture | T5 v1.1 |
| Tensor Type | F32 |
What is pko-t5-large?
pko-t5-large is a Korean-specific T5 (Text-to-Text Transfer Transformer) model trained exclusively on Korean language data. It uses BBPE (byte-level BPE) tokenization instead of the traditional SentencePiece tokenizer, which eliminates out-of-vocabulary (OOV) issues when encoding Korean text.
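As a quick illustration of the byte-level tokenizer, the sketch below loads the checkpoint with Hugging Face Transformers and encodes an arbitrary Korean sentence. The Hub id `paust/pko-t5-large` and the sample sentence are assumptions for this example, not part of this card.

```python
# Minimal sketch; adjust the model id to the checkpoint you actually use.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "paust/pko-t5-large"  # assumed Hugging Face Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# Byte-level BPE falls back to raw bytes, so arbitrary Korean text
# encodes without producing <unk> tokens.
text = "한국어 전용으로 사전학습된 T5 모델입니다."
ids = tokenizer(text, return_tensors="pt").input_ids
print(tokenizer.convert_ids_to_tokens(ids[0]))
assert tokenizer.unk_token_id not in ids[0].tolist()
```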
Implementation Details
The model was trained with unsupervised learning on diverse Korean datasets, including Namuwiki, Korean Wikipedia, and the Modu Corpus (모두의 말뭉치). It uses T5's span corruption objective for pre-training (a minimal example of this format follows the list below) and is designed to be fine-tuned on specific downstream tasks.
- Built on T5 v1.1 architecture with 820M parameters
- Uses BBPE tokenization optimized for Korean language
- Achieves state-of-the-art performance on multiple KLUE benchmarks
- Supports both single-task and multi-task fine-tuning
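For reference, T5's span corruption objective replaces random spans of the input with sentinel tokens and trains the decoder to reproduce the masked spans. The sketch below shows that text-to-text format, reusing the `tokenizer` and `model` objects loaded above; the Korean sentence and span choices are made up for illustration, and it assumes the pko-t5 tokenizer exposes T5's standard `<extra_id_*>` sentinel tokens.

```python
# Hypothetical span-corruption pair in T5's text-to-text format.
# Input: the sentence with two spans replaced by sentinel tokens.
corrupted = "한국어 <extra_id_0> 위키피디아 등 대규모 말뭉치로 <extra_id_1> 모델입니다."
# Target: the masked spans, each prefixed by its sentinel token.
spans = "<extra_id_0> 나무위키, <extra_id_1> 사전학습된 <extra_id_2>"

batch = tokenizer(corrupted, return_tensors="pt")
labels = tokenizer(spans, return_tensors="pt").input_ids

# Standard seq2seq cross-entropy loss, as used during pre-training.
loss = model(**batch, labels=labels).loss
print(float(loss))
```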
Core Capabilities
- Strong performance on KLUE benchmarks (NER: 88.18 F1, RE: 75.17 F1)
- Excellent dependency parsing capabilities (97.60 LAS)
- Robust machine reading comprehension (68.01 EM / 71.44 F1)
- Effective for text classification and natural language inference tasks
Frequently Asked Questions
Q: What makes this model unique?
Its distinguishing features are BBPE tokenization tailored to Korean and pre-training exclusively on Korean corpora, which makes it more effective on Korean language tasks than comparable multilingual models.
Q: What are the recommended use cases?
The model is best suited for fine-tuning on specific Korean language tasks such as named entity recognition, relation extraction, dependency parsing, and machine reading comprehension. It's recommended to fine-tune the model rather than using it as-is.
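As a starting point, a single fine-tuning step in the text-to-text format might look like the sketch below. The Hub id, the KLUE-NLI-style prompt wording, and the hyperparameters are illustrative placeholders, not the authors' training recipe.

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "paust/pko-t5-large"  # assumed Hugging Face Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# Example: a KLUE-style NLI pair cast as text-to-text (hypothetical prompt format).
source = "klue nli 전제: 그는 학교에 갔다. 가설: 그는 집에 있었다."
target = "모순"

batch = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()

# One training step: forward pass, seq2seq loss, parameter update.
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```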