papuGaPT2

flax-community

Polish GPT2 language model trained on the Polish subset of the Oscar corpus for text generation. It reaches an evaluation perplexity of 21.79 and can be prompted in zero-shot and few-shot settings.

Property               Value
Language               Polish
Training Data          Oscar Corpus (Polish subset)
Evaluation Perplexity  21.79
Author                 flax-community

What is papuGaPT2?

papuGaPT2 is a Polish-language GPT2 model intended to give the Polish NLP community a dedicated text generation model. It uses the standard GPT2 architecture and was trained with a causal language modeling objective on the Polish subset of the multilingual Oscar corpus. The model reached an evaluation perplexity of 21.79.

Implementation Details

The model uses a byte-level version of Byte Pair Encoding for tokenization with a vocabulary size of 50,257. Training was conducted on a TPUv3 VM in three phases, with varying learning rates and batch sizes. Training ended with an evaluation loss of 3.082, which corresponds to the reported perplexity of 21.79, since perplexity is the exponential of the mean per-token loss.

  • Tokenization: Byte-level BPE with 50,257 vocab size
  • Input sequences: 512 consecutive tokens
  • Training infrastructure: TPUv3 VM
  • Training phases: 3 distinct phases with different learning rates
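The relationship between the reported loss and perplexity, and the grouping of tokens into 512-token inputs, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' training code: the helper names are hypothetical, and dropping the trailing partial block is an assumption about how the corpus was chunked.

```python
import math

def perplexity_from_loss(mean_nll: float) -> float:
    """Perplexity is the exponential of the mean per-token cross-entropy (natural log)."""
    return math.exp(mean_nll)

def chunk_token_ids(token_ids, block_size=512):
    """Split a flat list of token IDs into consecutive fixed-size blocks.

    Assumption: the trailing partial block is dropped; the source does not
    say how the remainder was handled.
    """
    usable = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, usable, block_size)]

# Reported evaluation loss from the model description.
ppl = perplexity_from_loss(3.082)
print(f"perplexity = {ppl:.2f}")

# Toy stream of 1,030 fake token IDs: yields two full blocks of 512.
blocks = chunk_token_ids(list(range(1030)))
print(len(blocks), len(blocks[0]))
```

The computed value lands within a few hundredths of the reported 21.79; the small difference comes from the loss being rounded to three decimals.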

Core Capabilities

  • Text generation with multiple decoding methods (greedy, beam search, sampling)
  • Support for top-k and top-p sampling
  • Zero-shot and few-shot learning capabilities
  • Bad words filtering functionality
  • Context-aware text completion
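To make the decoding options above concrete, here is a small pure-Python sketch of greedy selection and of top-k and top-p (nucleus) filtering over a toy next-token distribution. The function names and the four-token logits are hypothetical; a real run would take logits from the model, and libraries may differ in edge-case handling.

```python
import math
import random

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_top_p_filter(probs, top_k=0, top_p=1.0):
    """Keep the top_k most likely tokens, then the smallest prefix of those
    whose cumulative probability reaches top_p; renormalize what is kept."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

def sample(filtered, rng):
    """Draw one token index from the filtered, renormalized distribution."""
    indices = list(filtered)
    weights = [filtered[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]

probs = softmax([2.0, 1.0, 0.5, -1.0])                    # toy 4-token vocabulary
greedy = max(range(len(probs)), key=lambda i: probs[i])   # greedy: always the argmax
filtered = top_k_top_p_filter(probs, top_k=3, top_p=0.9)
token = sample(filtered, random.Random(0))                # sampling: varies with the seed
```

Greedy decoding is deterministic, while sampling trades determinism for variety; top-k and top-p limit that variety to the more probable tokens.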

Frequently Asked Questions

Q: What makes this model unique?

It is one of the first text generation models trained specifically for Polish, which fills a gap in Polish NLP research. Its evaluation perplexity of 21.79 and its support for zero-shot and few-shot prompting make it useful across Polish language processing tasks.

Q: What are the recommended use cases?

The model is primarily recommended for research purposes due to potential biases in the training data. It can be used for text generation, feature extraction, or fine-tuning for downstream tasks. However, users should be aware of and account for potential biases, particularly regarding gender and ethnicity.
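For research use, text generation with a few-shot prompt might look like the sketch below. It assumes the Hugging Face `transformers` library with a supported backend installed, and that the model is published on the Hub as `flax-community/papuGaPT2` (inferred from the author and model name here; verify before use). The prompt format, the example reviews, and the decoding settings are all invented for illustration.

```python
# Assumed Hub ID, built from the author and model name; verify before use.
MODEL_ID = "flax-community/papuGaPT2"

# Illustrative decoding settings, not values recommended by the model authors.
GENERATION_KWARGS = {"max_new_tokens": 10, "do_sample": True, "top_k": 50, "top_p": 0.95}

def build_few_shot_prompt(examples, query):
    """Format (text, label) pairs plus an unlabeled query as one prompt string."""
    blocks = [f"Recenzja: {text}\nOcena: {label}" for text, label in examples]
    blocks.append(f"Recenzja: {query}\nOcena:")
    return "\n\n".join(blocks)

def generate(prompt, seed=42):
    """Sample one continuation of the prompt.

    transformers is imported lazily so the prompt helper works without it.
    """
    from transformers import pipeline, set_seed
    set_seed(seed)
    generator = pipeline("text-generation", model=MODEL_ID)
    return generator(prompt, **GENERATION_KWARGS)[0]["generated_text"]

# Hypothetical sentiment examples, invented for illustration.
EXAMPLES = [
    ("Świetny film, polecam każdemu.", "pozytywna"),
    ("Nudna fabuła i słaba gra aktorska.", "negatywna"),
]
prompt = build_few_shot_prompt(EXAMPLES, "Bardzo dobra książka.")
# print(generate(prompt))  # downloads the model weights on first call
```

Outputs from a setup like this should be reviewed for the gender and ethnicity biases noted above before being used in any analysis.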
