gpt3-finnish-small

Maintained by: TurkuNLP

Property          Value
Parameter Count   186M
Architecture      BLOOM-based GPT-3
License           Apache 2.0
Training Tokens   300B
Language          Finnish

What is gpt3-finnish-small?

gpt3-finnish-small is the smallest model in TurkuNLP's Finnish GPT-3 family, a series of models built for Finnish text generation. It has 186M parameters, arranged as 12 layers with 768-dimensional embeddings and 12 attention heads. The model uses the BLOOM architecture and was trained on 300B tokens drawn from a range of Finnish sources.
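As a rough sanity check on the figures above, the parameter count can be estimated from the layer count and embedding width. The sketch below assumes a BLOOM-style layer layout (fused query/key/value projection, a 4x feed-forward expansion, biases on all linear layers, no learned position embeddings, tied input and output embeddings) and a vocabulary of 131,072 tokens. The vocabulary size is not stated on this page and is an assumption; the function name is hypothetical.

```python
def bloom_style_param_count(n_layers: int, hidden: int, vocab_size: int) -> int:
    """Estimate the parameter count of a BLOOM-style decoder (assumed layout)."""
    # Per layer: QKV (3h^2 + 3h), attention output (h^2 + h),
    # MLP up (4h^2 + 4h), MLP down (4h^2 + h), two layer norms (2h each).
    per_layer = 12 * hidden**2 + 13 * hidden
    # Token embeddings, assumed to be shared with the output projection.
    embeddings = vocab_size * hidden
    # Layer norm after the embeddings and final layer norm (weight + bias each).
    extra_norms = 2 * (2 * hidden)
    return n_layers * per_layer + embeddings + extra_norms


# 12 layers and 768-dimensional embeddings come from the model description;
# vocab_size=131072 is an assumption, not a figure from this page.
total = bloom_style_param_count(n_layers=12, hidden=768, vocab_size=131072)
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the estimate lands at roughly 186M, consistent with the stated size. More than half of that total sits in the embedding matrix, which is typical for a small model with a large vocabulary.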

Implementation Details

The model is implemented with PyTorch and Transformers and follows the BLOOM architecture. Its training data combines multiple Finnish resources, including the Internet Parsebank, Common Crawl, Wikipedia, news archives, and social media content. Each source is sampled at a fixed ratio to balance quality and diversity.

  • 12 transformer layers with 768-dimensional embeddings
  • 12 attention heads (64 dimensions per head)
  • Trained on a weighted combination of sources, with Parsebank (22.7%) and Common Crawl (34.4%) together making up about 57% of the mix
  • Decoder-only transformer architecture for autoregressive text generation
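One common way to arrive at sampling ratios like those above is to multiply each source's raw size by an upsampling weight and then normalize. The sketch below illustrates that idea only; it is not taken from the model's training code, and the source names, sizes, and weights are hypothetical placeholders.

```python
import random


def weighted_ratios(sizes: dict, weights: dict) -> dict:
    """Return each source's share of the mix after applying per-source weights."""
    weighted = {name: sizes[name] * weights.get(name, 1.0) for name in sizes}
    total = sum(weighted.values())
    return {name: value / total for name, value in weighted.items()}


def sample_sources(ratios: dict, n: int, seed: int = 0) -> list:
    """Draw n source names with probability proportional to their ratio."""
    rng = random.Random(seed)
    names = list(ratios)
    return rng.choices(names, weights=[ratios[name] for name in names], k=n)


# Hypothetical numbers: a small curated source upsampled 3x against a large web crawl.
sizes = {"curated": 10.0, "web": 90.0}
weights = {"curated": 3.0, "web": 1.0}

ratios = weighted_ratios(sizes, weights)
print(ratios)  # curated rises from 10% of the raw data to 25% of the mix
print(sample_sources(ratios, n=5))
```

Upsampling in this way lets a small, high-quality source contribute more than its raw size would suggest, at the cost of repeating its documents more often during training.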

Core Capabilities

  • Pure language modeling for Finnish text generation
  • Foundation model suitable for further fine-tuning
  • Text completion and generation tasks
  • Feature extraction for downstream NLP tasks
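The capabilities above can be tried with the Transformers library. The following is a minimal sketch, assuming the model is published on the Hugging Face Hub under the identifier `TurkuNLP/gpt3-finnish-small` and that `transformers` and `torch` are installed. The helper names and the default sampling settings are illustrative choices, not recommendations from the model's authors.

```python
MODEL_ID = "TurkuNLP/gpt3-finnish-small"  # assumed Hugging Face Hub identifier


def build_generation_kwargs(max_new_tokens=40, temperature=0.8, top_p=0.95):
    """Collect sampling settings for generate(); the defaults are illustrative."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def complete(prompt, **overrides):
    """Continue a Finnish prompt with the base model (downloads weights on first call)."""
    # Imported lazily so that defining the helpers does not require the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **build_generation_kwargs(**overrides))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (not executed here because it downloads the model):
# print(complete("Suomen pääkaupunki on", max_new_tokens=20))
```

Because this is a base model, the prompt works best as the beginning of a text to be continued rather than as an instruction or a question.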

Frequently Asked Questions

Q: What makes this model unique?

This model is built specifically for Finnish and was trained on 300B tokens of Finnish text from varied sources. It is the smallest member of a family of models whose sizes span different compute budgets.

Q: What are the recommended use cases?

The model is best suited as a foundation model for further fine-tuning on specific tasks. It is not instruction-tuned, so it does not handle dialogue or question answering out of the box; it serves as a base model that can be adapted for those uses.
