t5-base-japanese

Maintained By
sonoisa

  • Parameters: 222M
  • License: CC BY-SA 4.0
  • Training Data: Wikipedia, OSCAR, CC-100
  • Framework: PyTorch

What is t5-base-japanese?

t5-base-japanese is a specialized Text-to-Text Transfer Transformer (T5) model pre-trained specifically for Japanese language tasks. Developed by sonoisa, this model leverages approximately 100GB of Japanese text from diverse sources including Wikipedia, OSCAR corpus, and CC-100 dataset. The model demonstrates superior performance compared to multilingual alternatives, particularly in tasks like news classification.

Implementation Details

The model utilizes a SentencePiece tokenizer trained on the complete Japanese Wikipedia dataset. With 222M parameters, it is roughly 25% smaller than Google's mT5-small while achieving better performance. The pretrained checkpoint requires task-specific fine-tuning, after which it delivers strong performance on downstream tasks.

  • Pre-trained on 100GB of Japanese text
  • Achieves 97% accuracy on livedoor news classification
  • JSQuAD performance: EM=0.900, F1=0.945
  • Implements T5 architecture with Japanese-specific optimizations
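As a sketch of how the checkpoint might be loaded and run with Hugging Face transformers: the model id `sonoisa/t5-base-japanese` is the assumed Hub id, the task prefix is illustrative (it must match whatever prefix a fine-tuned checkpoint was trained with), and, as noted above, the raw pretrained model needs fine-tuning before its generations are useful.

```python
MODEL_NAME = "sonoisa/t5-base-japanese"  # assumed Hugging Face Hub id

def build_input(task_prefix: str, text: str) -> str:
    """T5 casts every task as text-to-text; a task prefix tells a
    fine-tuned checkpoint which task to perform (prefix wording is a
    convention chosen at fine-tuning time, not fixed by this model)."""
    return f"{task_prefix}: {text}"

def run(text: str, task_prefix: str = "分類") -> str:
    """Generate output for one input (downloads weights; needs network)."""
    # Imported lazily so build_input stays dependency-free.
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    # T5Tokenizer wraps the model's SentencePiece vocabulary.
    tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
    model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)
    inputs = tokenizer(build_input(task_prefix, text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=32)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The lazy import keeps the string-formatting helper usable (and testable) without the heavyweight dependency installed.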

Core Capabilities

  • Text classification with high accuracy (97% on news classification)
  • Question answering (JSQuAD benchmark)
  • Text generation and sequence-to-sequence tasks
  • Feature extraction for Japanese text
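For feature extraction, one common approach (an assumption on my part, not a recipe documented by the author) is to run only the T5 encoder and mean-pool its last hidden states over non-padding tokens. The pooling itself is just a masked average:

```python
def mean_pool(vectors, mask):
    """Masked average of token vectors.

    vectors: one embedding (list of floats) per token
    mask:    1 for real tokens, 0 for padding
    """
    kept = [v for v, m in zip(vectors, mask) if m]
    dim = len(kept[0])
    return [sum(v[i] for v in kept) / len(kept) for i in range(dim)]

def embed(text: str):
    """Sentence vector from the encoder half of the model
    (downloads weights; needs network)."""
    from transformers import T5Tokenizer, T5EncoderModel
    import torch

    tokenizer = T5Tokenizer.from_pretrained("sonoisa/t5-base-japanese")
    encoder = T5EncoderModel.from_pretrained("sonoisa/t5-base-japanese")
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state[0]  # (tokens, hidden_dim)
    return mean_pool(hidden.tolist(), enc["attention_mask"][0].tolist())
```

Masking matters when sentences are batched with padding; for a single unpadded sentence the mask is all ones and the result is a plain average.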

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized Japanese language capabilities and improved efficiency, offering better performance than multilingual alternatives with a smaller parameter count. It's particularly notable for achieving 6 percentage points higher accuracy than mT5 on news classification tasks.

Q: What are the recommended use cases?

The model is well-suited for Japanese text classification, question answering, and sequence-to-sequence tasks. However, it requires task-specific fine-tuning before deployment. Users should be aware of potential biases in the training data and ensure ethical usage.
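Fine-tuning T5 for classification means casting the labels themselves as target text: the model learns to generate the category name. A minimal sketch of the data preparation, where the prefix string and the livedoor-style category names are illustrative assumptions:

```python
def to_text2text(title: str, label: str, prefix: str = "ニュース分類") -> dict:
    """Pair a task-prefixed input with the label as the target string."""
    return {"input": f"{prefix}: {title}", "target": label}

# Illustrative (title, category) pairs in livedoor-news style.
pairs = [
    ("新型スマートフォンが発表された", "it-life-hack"),
    ("サッカー日本代表が快勝", "sports-watch"),
]
dataset = [to_text2text(title, label) for title, label in pairs]
```

Each `input`/`target` pair can then be tokenized and fed to a standard sequence-to-sequence training loop; no classification head is added, which is what distinguishes the T5 formulation from encoder-only classifiers.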
