t5-base-nl36-finnish

Maintained by: Finnish-NLP

Property: Value
Parameter Count: 814 million
Model Type: T5 Encoder-Decoder
Architecture: Deep-Narrow with 36 transformer layers
Training Data: 76GB of cleaned Finnish text
Vocabulary Size: 32,000 tokens

What is t5-base-nl36-finnish?

t5-base-nl36-finnish is a Finnish-only language model based on the T5 encoder-decoder architecture. It has 814 million parameters and was pretrained on 76GB of cleaned Finnish text, including news archives, Wikipedia, and web crawl content. The model uses the Deep-Narrow architecture with 36 transformer layers, a design that favors depth over width and is intended to improve downstream performance after fine-tuning.

Implementation Details

The model incorporates the T5 v1.1 improvements and was pretrained on a TPUv3-8 VM for 1M steps with a batch size of 64, processing approximately 33B tokens. Its pretraining objective is span-based masked language modeling. The T5 v1.1 changes include GEGLU activation in the feed-forward layers and no parameter sharing between the embedding and classifier layers.
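To make the span-based masked language modeling objective concrete, here is a minimal sketch of how one input/target pair can be built. It assumes the usual T5 convention of `<extra_id_N>` sentinel tokens; the function name `corrupt_spans`, the example sentence, and the hand-picked spans are illustrative and not taken from the model's actual preprocessing code, which samples spans randomly.

```python
def corrupt_spans(tokens, spans):
    """Replace each span with a sentinel and collect the removed tokens as the target.

    `spans` is a sorted list of non-overlapping, half-open (start, end) index pairs.
    The `<extra_id_N>` sentinel naming is an assumption based on the T5 convention.
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])  # keep the text before the span
        inputs.append(sentinel)              # the span collapses to one sentinel
        targets.append(sentinel)             # the target names the sentinel...
        targets.extend(tokens[start:end])    # ...followed by the removed tokens
        cursor = end
    inputs.extend(tokens[cursor:])           # keep the text after the last span
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel ends the target
    return inputs, targets


# Whitespace splitting stands in for the real tokenizer here.
tokens = "Helsinki on Suomen pääkaupunki ja suurin kaupunki".split()
inputs, targets = corrupt_spans(tokens, [(1, 3), (5, 6)])
print(" ".join(inputs))   # Helsinki <extra_id_0> pääkaupunki ja <extra_id_1> kaupunki
print(" ".join(targets))  # <extra_id_0> on Suomen <extra_id_1> suurin <extra_id_2>
```

The encoder sees the corrupted input and the decoder learns to emit the target, which is why the pretrained checkpoint only fills in spans and needs fine-tuning before it can perform a downstream task.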

  • Uses WordPiece tokenization with a vocabulary of 32,000 tokens
  • Is case-sensitive: text is not lowercased
  • Was trained with the AdaFactor optimizer and learning rate warmup
  • Takes sequences of 512 consecutive tokens as inputs and outputs
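The "approximately 33B tokens" figure can be checked from the numbers given in this section. The short calculation below assumes every training sequence holds the full 512 tokens and counts only input tokens; the variable names are illustrative.

```python
steps = 1_000_000   # training steps
batch_size = 64     # sequences per step
seq_len = 512       # tokens per sequence (assumed fully packed)

tokens_seen = steps * batch_size * seq_len
print(f"{tokens_seen:,} tokens (~{tokens_seen / 1e9:.1f}B)")
# 32,768,000,000 tokens (~32.8B)
```

The product is about 32.8 billion, consistent with the rounded 33B figure.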

Core Capabilities

  • Achieves 94.40% accuracy on Yle News classification after fine-tuning
  • Achieves 75.97% accuracy on Eduskunta classification
  • Outperforms multilingual mT5 models on Finnish text classification
  • Supports various downstream NLP tasks through fine-tuning
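Because T5 is a text-to-text model, a fine-tuned classifier emits its label as a string, and accuracy can be scored by exact match against the gold label. The sketch below shows one way to do that; the function name and the Finnish topic labels are hypothetical, and the source does not describe the exact evaluation script behind the figures above.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of generated label strings that equal the gold label exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical decoded outputs and gold labels for four news articles.
preds = ["urheilu", "politiikka", "talous", "urheilu"]
golds = ["urheilu", "politiikka", "kulttuuri", "urheilu"]
print(f"{exact_match_accuracy(preds, golds):.2%}")  # 75.00%
```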

Frequently Asked Questions

Q: What makes this model unique?

The model combines a Deep-Narrow architecture of 36 transformer layers with pretraining exclusively on Finnish text. In the reported Finnish text classification results, it outperforms the multilingual mT5 models.

Q: What are the recommended use cases?

The pretrained model must be fine-tuned on a specific task before use. It is suited to Finnish text classification, text generation, and other NLP tasks that can be framed as text-to-text problems. Fine-tuning should use full fp32 precision for best results.
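A minimal sketch of preparing the model for fine-tuning in fp32 with the Hugging Face `transformers` library follows. The Hub id `Finnish-NLP/t5-base-nl36-finnish` is an assumption inferred from the maintainer and model names above, and `load_for_finetuning`, `make_example`, and the sample text and label are hypothetical helpers for illustration, not part of any published recipe.

```python
MODEL_ID = "Finnish-NLP/t5-base-nl36-finnish"  # assumed Hub id, not confirmed by the source


def load_for_finetuning(model_id=MODEL_ID):
    """Load tokenizer and model with weights explicitly kept in fp32.

    Imports are done lazily so the rest of this sketch runs without
    torch/transformers installed; calling this function downloads the weights.
    """
    import torch
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float32  # avoid fp16/bf16 during fine-tuning
    )
    return tokenizer, model


def make_example(text, label):
    """Frame a classification sample as a text-to-text pair (hypothetical format)."""
    return {"source": text.strip(), "target": label.strip()}


example = make_example(" Eduskunta äänesti tänään budjetista. ", "politiikka")
print(example)
```

Calling `load_for_finetuning()` fetches the checkpoint; the returned model can then be trained with a standard sequence-to-sequence loop, with mixed-precision options left disabled to stay in fp32.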
