roberta-large-finnish

Finnish-NLP

RoBERTa large model for Finnish language processing, trained on 78GB of text data including news and web content. Optimized for masked language modeling (MLM) and downstream fine-tuning.

  • Model Type: RoBERTa Large
  • Training Data: 78GB Finnish text
  • Tokenizer: BPE (50,265 vocab size)
  • Developer: Finnish-NLP

What is roberta-large-finnish?

roberta-large-finnish is a powerful Finnish language model based on the RoBERTa architecture, trained on a diverse corpus of Finnish text including news archives, Wikipedia, and web crawl data. The model specializes in masked language modeling (MLM) and is designed for downstream task fine-tuning.
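Masked language modeling can be sketched with the transformers fill-mask pipeline. The repository id "Finnish-NLP/roberta-large-finnish" is an assumption here (developer organization plus model name); adjust it to the actual Hub location of the checkpoint:

```python
# Hedged sketch: the Hub model id "Finnish-NLP/roberta-large-finnish" is assumed.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="Finnish-NLP/roberta-large-finnish")

# RoBERTa-style models use "<mask>" as the mask token.
predictions = fill_mask("Helsinki on Suomen <mask>.")
for p in predictions:
    print(p["token_str"], round(p["score"], 3))
```

By default the pipeline returns the five most likely fillers for the masked position, each with a probability score.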

Implementation Details

The model was trained on a TPUv3-8 VM using the Adafactor optimizer. Training ran for two epochs at a sequence length of 128, followed by one additional epoch at a sequence length of 512. Unlike the original BERT, which masks tokens once during preprocessing, the model uses dynamic masking, re-sampling the masked positions every time a sequence is fed to the model, which makes pretraining more robust.

  • Trained on a combined 78GB of cleaned Finnish text data
  • Uses byte-level BPE tokenization with a 50,265-token vocabulary
  • Masks 15% of tokens during pretraining, with varied replacement strategies
  • Achieves 88.02% average accuracy on downstream tasks
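The dynamic-masking scheme above can be illustrated in plain Python. The 80/10/10 replacement split (mask token / random token / unchanged) is the standard RoBERTa recipe and is assumed here, since the card only says "varied replacement strategies":

```python
import random

def dynamic_mask(token_ids, mask_token_id, vocab_size,
                 mask_prob=0.15, rng=None):
    """RoBERTa-style dynamic masking, re-sampled on every pass over the data.

    Of the ~15% of selected positions: 80% become the mask token,
    10% a random token, and 10% are left unchanged (assumed 80/10/10 split).
    """
    rng = rng or random.Random()
    masked = list(token_ids)
    labels = [-100] * len(token_ids)  # -100 = position ignored by the MLM loss
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            labels[i] = tok           # predict the original token here
            roll = rng.random()
            if roll < 0.8:
                masked[i] = mask_token_id
            elif roll < 0.9:
                masked[i] = rng.randrange(vocab_size)
            # else: keep the original token unchanged
    return masked, labels
```

Because the selection is re-drawn each time, the model sees different masked positions for the same sentence across epochs, unlike static masking fixed at preprocessing time.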

Core Capabilities

  • Masked language modeling for Finnish text
  • Sequence classification tasks
  • Token classification
  • Question answering
  • Feature extraction for downstream tasks
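As a sketch of the feature-extraction use case (the Hub id "Finnish-NLP/roberta-large-finnish" is again an assumption), sentence embeddings can be obtained by mean-pooling the encoder's last hidden state:

```python
# Hedged sketch: model id is assumed; requires torch and transformers.
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "Finnish-NLP/roberta-large-finnish"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("Tämä on esimerkkilause.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool token embeddings into one sentence vector
# (hidden size 1024 for a RoBERTa-large architecture).
embedding = outputs.last_hidden_state.mean(dim=1).squeeze(0)
print(embedding.shape)
```

Mean pooling is one simple choice; task-specific fine-tuning usually yields better sentence representations than raw pooled features.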

Frequently Asked Questions

Q: What makes this model unique?

This model is specifically optimized for Finnish language processing, trained on a comprehensive dataset of Finnish text. It improves upon previous Finnish language models and approaches the performance of FinBERT while using the more advanced RoBERTa architecture.

Q: What are the recommended use cases?

The model is primarily designed for tasks that use whole-sentence context, such as sequence classification, token classification, and question answering. It is not recommended for text generation, where autoregressive models such as GPT-2 are more appropriate.
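For the sequence-classification use case, fine-tuning setup can be sketched as follows; the model id and the two-label setup (e.g. binary sentiment) are assumptions for illustration:

```python
# Hedged sketch: model id and label count are assumed for illustration.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Finnish-NLP/roberta-large-finnish"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=2,  # assumed binary task, e.g. sentiment
)
# The pretrained MLM head is dropped and a fresh classification head is
# randomly initialized; transformers warns about the new weights, which is
# expected before fine-tuning on labeled data.
```

From here, training typically proceeds with the transformers Trainer or a plain PyTorch loop over a labeled Finnish dataset.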
