pko-t5-base
| Property | Value |
|---|---|
| Parameter Count | 250M |
| Model Type | T5 v1.1 Variant |
| License | MIT |
| Author | PAUST |
| Model URL | https://huggingface.co/paust/pko-t5-base |
What is pko-t5-base?
pko-t5-base is a Korean-specific adaptation of the T5 v1.1 architecture, pre-trained on Korean data including Namuwiki, Wikipedia, and the Modern Korean Corpus. Unlike standard T5 models, it uses BBPE (byte-level BPE) tokenization instead of SentencePiece, which eliminates out-of-vocabulary (OOV) issues in Korean text processing.
Implementation Details
The model is pre-trained with T5's unsupervised span corruption objective on exclusively Korean data. It is implemented with the Hugging Face Transformers library and should be loaded with the T5TokenizerFast tokenizer for best results; a minimal loading sketch follows the list below.
- Architecture: T5 v1.1 with BBPE tokenization
- Training Data: Korean-specific datasets (Namuwiki, Wikipedia, Modern Korean Corpus)
- Training Method: Unsupervised learning with span corruption
- Model Size: 250M parameters (base version)
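For concreteness, here is a minimal loading sketch using the Hugging Face Transformers API. The sentinel-masked prompt mirrors the span corruption objective and is an illustrative assumption, not an example taken from the model card.

```python
from transformers import T5TokenizerFast, T5ForConditionalGeneration

# Load the Korean-specific checkpoint; the model card recommends
# T5TokenizerFast rather than the SentencePiece-based T5Tokenizer.
tokenizer = T5TokenizerFast.from_pretrained("paust/pko-t5-base")
model = T5ForConditionalGeneration.from_pretrained("paust/pko-t5-base")

# Illustrative sentinel-masked input, mimicking the span corruption
# format used during pre-training (the sentence is a made-up example).
text = "한국어 자연어 처리는 <extra_id_0> 분야이다."
inputs = tokenizer(text, return_tensors="pt")

# Generate a short completion for the masked span.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```

Because the pre-trained checkpoint is intended as a starting point for fine-tuning, raw generations like this serve only as a sanity check that the model and tokenizer load correctly.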
Core Capabilities
- Strong performance on KLUE benchmark tasks after task-specific fine-tuning
- 87.29% F1 on YNAT (topic classification)
- 97.28% LAS on dependency parsing
- 61.53% EM on machine reading comprehension (MRC)
- Supports both single-task and multi-task fine-tuning
Frequently Asked Questions
Q: What makes this model unique?
The model's use of BBPE tokenization instead of SentencePiece makes it particularly effective for Korean text processing, eliminating the OOV issues common in Korean subword vocabularies. It is also optimized specifically for Korean through extensive pre-training on Korean-only datasets.
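A quick way to see the effect of byte-level tokenization is to tokenize arbitrary Korean text and check for unknown tokens. The sentence below is a made-up example, and the check assumes the byte-level vocabulary covers all input bytes.

```python
from transformers import T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("paust/pko-t5-base")

# Byte-level BPE can always fall back to raw bytes, so even rare or
# unusual Korean strings should not map to an unknown token.
text = "긴꼬리딱새가 담장 위를 사뿐히 지나갔다."
ids = tokenizer(text).input_ids
print(tokenizer.convert_ids_to_tokens(ids))
print(tokenizer.unk_token_id in ids)  # expected: False (no OOV tokens)
```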
Q: What are the recommended use cases?
The model is designed to be fine-tuned for specific Korean language tasks. After task-specific fine-tuning, it performs particularly well on KLUE benchmark tasks including text classification, named entity recognition, semantic textual similarity, and machine reading comprehension.
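Since the checkpoint is meant to be fine-tuned, the sketch below shows one supervised training step for a text-to-text formulation of a classification task. The "ynat:" prefix, the example pair, and the hyperparameters are hypothetical choices for illustration, not settings from the model card.

```python
import torch
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("paust/pko-t5-base")
model = T5ForConditionalGeneration.from_pretrained("paust/pko-t5-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Hypothetical YNAT-style pair: topic classification cast as text-to-text.
source = "ynat: 삼성전자, 새 반도체 공장 착공"  # illustrative input
target = "경제"                                  # illustrative label text

inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

# One gradient step on a single example; a real run would batch a
# KLUE-style dataset, mask padding in the labels, and evaluate.
model.train()
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"loss: {loss.item():.4f}")
```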