t5-base-nl36-finnish
| Property | Value |
|---|---|
| Parameter Count | 814 million |
| Model Type | T5 Encoder-Decoder |
| Architecture | Deep-Narrow with 36 transformer layers |
| Training Data | 76GB of cleaned Finnish text |
| Vocabulary Size | 32,000 tokens |
What is t5-base-nl36-finnish?
t5-base-nl36-finnish is a Finnish language model based on the T5 architecture and designed specifically for Finnish language processing tasks. With 814 million parameters, it was pretrained on a diverse collection of Finnish text, including news archives, Wikipedia, and web crawl content. The model uses the efficient Deep-Narrow configuration with 36 transformer layers, which makes it particularly effective on downstream tasks after fine-tuning.
Implementation Details
The model was trained with the T5 v1.1 improvements on a TPUv3-8 VM for 1M steps with a batch size of 64, processing approximately 33B tokens. Pretraining uses span-based masked language modeling, and the v1.1 recipe adds GEGLU activation in the feed-forward blocks and drops parameter sharing between the embedding and classifier layers.
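To make the GEGLU activation concrete, here is a minimal plain-Python sketch of a GEGLU feed-forward block. The weight matrices and dimensions are purely illustrative, not the model's actual parameters:

```python
import math

def gelu(x):
    # Exact GELU using the Gaussian CDF: gelu(x) = x * Phi(x)
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def geglu_ffn(x, W, V, W_out):
    # GEGLU feed-forward: FFN(x) = (GELU(x @ W) * (x @ V)) @ W_out
    # W gates the hidden units; V carries the linear path.
    def matvec(M, v):
        return [sum(m * v_j for m, v_j in zip(row, v)) for row in M]
    hidden = [gelu(h) * g for h, g in zip(matvec(W, x), matvec(V, x))]
    return matvec(W_out, hidden)

# Tiny 2-dimensional example with identity weights, for illustration:
identity = [[1.0, 0.0], [0.0, 1.0]]
out = geglu_ffn([1.0, 2.0], identity, identity, identity)
```

Compared with the plain ReLU feed-forward block of the original T5, the gated variant adds a second projection but tends to improve quality at the same parameter budget.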
- Uses WordPiece tokenization with a 32,000-token vocabulary
- Case-sensitive text processing
- Trained with the Adafactor optimizer and learning rate warmup
- Input and output sequences of 512 consecutive tokens
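The span-based masked language modeling objective can be illustrated with a small self-contained sketch. The sentinel names follow T5's `<extra_id_N>` convention; for clarity the spans here are fixed rather than randomly sampled as in real pretraining:

```python
def corrupt_spans(tokens, spans):
    """Replace each (start, length) span with a sentinel token and
    collect the masked-out tokens as the decoder target (T5-style)."""
    inputs, targets = [], []
    pos, sentinel = 0, 0
    for start, length in sorted(spans):
        inputs.extend(tokens[pos:start])          # keep unmasked tokens
        inputs.append(f"<extra_id_{sentinel}>")   # sentinel marks the gap
        targets.append(f"<extra_id_{sentinel}>")
        targets.extend(tokens[start:start + length])
        pos = start + length
        sentinel += 1
    inputs.extend(tokens[pos:])
    targets.append(f"<extra_id_{sentinel}>")      # final closing sentinel
    return inputs, targets

tokens = "Helsinki on Suomen pääkaupunki".split()
inputs, targets = corrupt_spans(tokens, [(1, 1), (3, 1)])
# inputs:  ['Helsinki', '<extra_id_0>', 'Suomen', '<extra_id_1>']
# targets: ['<extra_id_0>', 'on', '<extra_id_1>', 'pääkaupunki', '<extra_id_2>']
```

The encoder sees the corrupted input and the decoder learns to emit the sentinels followed by the tokens each one replaced.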
Core Capabilities
- Achieves 94.40% accuracy on Yle News classification after fine-tuning
- Reaches 75.97% accuracy on the Eduskunta classification task
- Outperforms multilingual mT5 models on Finnish text classification
- Supports various downstream NLP tasks through fine-tuning
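Because T5 is a text-to-text model, classification is framed as generating the label string, and accuracy is exact match between the generated and gold labels. A minimal sketch of that framing (the task prefix and Finnish label names here are hypothetical, not taken from the actual evaluation setup):

```python
def to_text2text(example, task_prefix="classify news: "):
    # Hypothetical text-to-text formatting: prepend a task prefix to the
    # document and use the label name itself as the decoder target.
    return {"input": task_prefix + example["text"],
            "target": example["label"]}

def exact_match_accuracy(predictions, references):
    # A prediction counts as correct only if the generated label
    # string matches the gold label exactly (after trimming whitespace).
    matches = sum(p.strip() == r.strip()
                  for p, r in zip(predictions, references))
    return matches / len(references)

ex = to_text2text({"text": "Jääkiekon MM-kisat alkavat", "label": "urheilu"})
acc = exact_match_accuracy(["urheilu", "politiikka", "talous", "urheilu"],
                           ["urheilu", "politiikka", "kulttuuri", "urheilu"])
# acc == 0.75
```

The same recipe applies to any of the downstream tasks listed above: only the task prefix and the set of target label strings change.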
Frequently Asked Questions
Q: What makes this model unique?
The model's Deep-Narrow architecture with 36 transformer layers, combined with its exclusive focus on Finnish, makes it particularly effective for Finnish NLP tasks. It clearly outperforms multilingual models such as mT5 on Finnish-specific tasks.
Q: What are the recommended use cases?
The model requires task-specific fine-tuning before use. It is particularly suitable for Finnish text classification, text generation, and other downstream NLP tasks. For best results, fine-tune with full fp32 precision.