gpt3-finnish-small
Property | Value
---|---
Parameter Count | 186M
Architecture | BLOOM-based GPT-3
License | Apache 2.0
Training Tokens | 300B
Language | Finnish
What is gpt3-finnish-small?
gpt3-finnish-small is part of TurkuNLP's Finnish GPT-3 model family, specifically designed for Finnish language text generation. As the smallest variant in the series with 186M parameters, it features 12 layers, 768-dimensional embeddings, and 12 attention heads. The model is built on the BLOOM architecture and trained on a comprehensive dataset of 300B tokens from diverse Finnish sources.
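A minimal text-generation sketch with the Transformers library is shown below. The model id `TurkuNLP/gpt3-finnish-small` and the sampling parameters are assumptions for illustration, not settings taken from this card:

```python
# Hedged sketch: loads the model via Transformers and generates a short
# Finnish continuation. Assumes the Hub id "TurkuNLP/gpt3-finnish-small".
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TurkuNLP/gpt3-finnish-small")
model = AutoModelForCausalLM.from_pretrained("TurkuNLP/gpt3-finnish-small")

inputs = tokenizer("Suomi on", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=20,   # illustrative sampling settings, not recommendations
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since this is a plain (non-instruction-tuned) language model, it continues the prompt rather than answering it.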
Implementation Details
The model is implemented using PyTorch and Transformers, incorporating key architectural elements from the BLOOM framework. It utilizes a carefully curated training dataset that combines multiple Finnish resources, including Internet Parsebank, Common Crawl, Wikipedia, news archives, and social media content, with specific sampling ratios to ensure quality and diversity.
- 12 transformer layers with 768-dimensional embeddings
- 12 attention heads for efficient context processing
- Trained on a weighted combination of sources, with Parsebank (22.7%) and Common Crawl (34.4%) forming the majority
- Implements standard transformer architecture for autoregressive text generation
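The architecture numbers above can be sanity-checked against the stated 186M parameter count. This is a back-of-the-envelope sketch: the vocabulary size of 131,072 is an assumption (the card does not state it), and layer norms, biases, and position-related parameters are ignored as negligible:

```python
# Rough parameter count for a 12-layer, 768-dim transformer.
# vocab_size is an ASSUMPTION, not stated in the card.
d_model = 768
n_layers = 12
vocab_size = 131_072  # assumed tokenizer vocabulary

embedding = vocab_size * d_model           # token embedding matrix
attention = 4 * d_model * d_model          # Q, K, V and output projections
mlp = 2 * d_model * (4 * d_model)          # up- and down-projection (4x expansion)
per_layer = attention + mlp                # ~12 * d_model^2 per layer
total = embedding + n_layers * per_layer

print(f"{total / 1e6:.0f}M parameters")    # prints "186M parameters"
```

Under these assumptions the count lands almost exactly on the reported 186M, with roughly half the parameters sitting in the embedding matrix.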
Core Capabilities
- Pure language modeling for Finnish text generation
- Foundation model suitable for further fine-tuning
- Text completion and generation tasks
- Feature extraction for downstream NLP tasks
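For the feature-extraction use case, one common approach is to take the model's hidden states and mean-pool them into a sentence vector. A sketch, again assuming the Hub id `TurkuNLP/gpt3-finnish-small` (mean pooling is one reasonable choice, not a method prescribed by this card):

```python
# Hedged sketch: extract 768-dim hidden states for downstream NLP tasks.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TurkuNLP/gpt3-finnish-small")
model = AutoModel.from_pretrained("TurkuNLP/gpt3-finnish-small")

enc = tokenizer("Hyvää huomenta!", return_tensors="pt")
with torch.no_grad():
    out = model(**enc)

hidden = out.last_hidden_state        # shape: (1, seq_len, 768)
sentence_vec = hidden.mean(dim=1)     # mean-pooled sentence embedding, (1, 768)
```

The resulting vectors can feed a lightweight classifier or similarity search without fine-tuning the model itself.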
Frequently Asked Questions
Q: What makes this model unique?
This model is specifically designed for Finnish language processing, trained on an extensive and diverse dataset of Finnish text. It's part of a larger family of models offering a range of sizes to match different computational budgets.
Q: What are the recommended use cases?
The model is best suited as a foundation model for further fine-tuning on specific tasks. Note that it is not instruction-tuned for dialogue or question answering out of the box; it serves as a base model for such adaptations.