gpt-for-est-base

Maintained By
tartuNLP

Property           Value
Parameter Count    118.68M
Model Type         GPT2
Framework          PyTorch 1.10.0
Training Data      2.2B words
Context Size       1024 tokens

What is gpt-for-est-base?

gpt-for-est-base is a specialized Estonian language model based on the GPT-2 architecture, trained from scratch on a diverse corpus of 2.2 billion words. Originally published under the name "gpt-4-est-base," it is a notable step forward for Estonian text generation.

Implementation Details

The model uses 12 transformer layers with 12 attention heads each and a 768-dimensional embedding space. It processes context windows of up to 1024 tokens, which makes it suitable for short- to medium-length text generation tasks.

  • Custom domain prefixes (>general<, >web<, >news<, >doaj<, >wiki<)
  • Trained on Estonian National Corpus, News Crawl, and Common Crawl
  • Built with Transformers 4.13.0.dev0
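The domain-prefix mechanism above can be sketched with the Hugging Face `transformers` library. This is a minimal, hedged example: the Hub id `tartuNLP/gpt-4-est-base` is assumed from the model's original name, and the sampling settings are illustrative, not the authors' recommendations.

```python
def make_prompt(text, domain="general"):
    """Prepend the >domain< control prefix the model was trained with."""
    domains = ("general", "web", "news", "doaj", "wiki")
    if domain not in domains:
        raise ValueError(f"unknown domain: {domain!r}")
    return f">{domain}< {text}"

def generate(text, domain="general", max_new_tokens=40):
    # Imported lazily so the prefix helper works even without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Hub id assumed from the model's original name; adjust if the repo differs.
    model_id = "tartuNLP/gpt-4-est-base"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(make_prompt(text, domain), return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens,
                         do_sample=True, top_p=0.95)
    return tokenizer.decode(out[0], skip_special_tokens=True)

# Example (downloads the model weights on first run):
# print(generate("Tartu Ülikool on", domain="wiki"))
```

The prefix is prepended as plain text because that is how the training data was tagged; no special tokenizer configuration is needed beyond the standard one shipped with the model.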

Core Capabilities

  • Domain-specific text generation
  • Context-aware language processing
  • Support for multiple text domains
  • Estonian language understanding and generation

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized focus on the Estonian language and its domain-specific prefix system, which allows targeted text generation across different content types. The substantial training corpus of 2.2B words supports robust language understanding.

Q: What are the recommended use cases?

The model is best suited to Estonian text generation, particularly when domain-specific content is required. It can generate news stories, wiki-style articles, academic abstracts, and general web content, provided the matching domain prefix is prepended to the prompt.
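One way to wire those use cases to the prefixes is a small lookup table. The mapping below is a hypothetical convention inferred from the prefix list earlier in this card (e.g. assuming `doaj` corresponds to academic abstracts, since DOAJ is a directory of open-access journals), not something the model card prescribes.

```python
# Hypothetical mapping from content type to the model's domain prefix.
USE_CASE_PREFIX = {
    "news article": "news",
    "wiki-style article": "wiki",
    "academic abstract": "doaj",
    "web content": "web",
}

def prompt_for(use_case, text):
    """Build a prompt with the domain prefix matching the use case."""
    # Fall back to the catch-all >general< prefix for anything unlisted.
    domain = USE_CASE_PREFIX.get(use_case, "general")
    return f">{domain}< {text}"
```

The resulting string can be fed directly to the tokenizer as the generation prompt.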
