gpt2-persian

Maintained By: bolbolzaban

GPT2-Persian

  • License: Apache 2.0
  • Language: Persian (Farsi)
  • Framework: PyTorch/TensorFlow
  • Context Window: 256 tokens

What is gpt2-persian?

GPT2-Persian is a language model based on the GPT2-medium architecture and trained specifically for Persian text generation. Developed by bolbolzaban, it reduces the context window to 256 tokens and uses Google's SentencePiece tokenizer in place of the standard BPE tokenization.
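
For a quick start, the model can be loaded with the Hugging Face transformers library. The snippet below is a minimal sketch: the Hub id bolbolzaban/gpt2-persian, the example Persian prompt, and the generation settings are assumptions to adjust for your own setup.

```python
# Minimal text-generation sketch; assumes the checkpoint is published on the
# Hugging Face Hub under the id "bolbolzaban/gpt2-persian".
from transformers import AutoTokenizer, GPT2LMHeadModel, pipeline

tokenizer = AutoTokenizer.from_pretrained("bolbolzaban/gpt2-persian")
model = GPT2LMHeadModel.from_pretrained("bolbolzaban/gpt2-persian")

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Example Persian prompt; keep max_length within the 256-token context window.
sample = generator("در یک اتفاق شگفت انگیز، پژوهشگران", max_length=64, do_sample=True)
print(sample[0]["generated_text"])
```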

Implementation Details

The model implements several key modifications to the original GPT2 architecture to better serve Persian language processing. Non-Persian characters are handled with designated special tokens such as [LAT], [URL], and [NUM], and the vocabulary includes poetry-specific tokens: [BOM] for the beginning of a verse and [EOS] for the end of a statement. A preprocessing sketch follows the list below.

  • Reduced context size (256 tokens) for improved training efficiency
  • Custom tokenization using Google SentencePiece
  • Special token handling for non-Persian characters
  • Poetry-specific tokens for verse structure
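
The training-time preprocessing is not reproduced here, but the special-token replacement described above can be illustrated with a rough sketch. The regular expressions, their ordering, and the helper name replace_non_persian are illustrative assumptions, not the author's actual pipeline.

```python
import re

# Illustrative approximation of mapping non-Persian content to the
# [URL], [NUM], and [LAT] placeholders; the real training rules may differ.
PATTERN = re.compile(
    r"(?P<url>https?://\S+|www\.\S+)"    # whole URLs first
    r"|(?P<num>\d+(?:[.,]\d+)*)"         # integers and decimals
    r"|(?P<lat>[A-Za-z][A-Za-z'\-.]*)"   # Latin-script words
)

def _token_for(match: re.Match) -> str:
    if match.group("url"):
        return "[URL]"
    if match.group("num"):
        return "[NUM]"
    return "[LAT]"

def replace_non_persian(text: str) -> str:
    """Replace URLs, numbers, and Latin-script words with placeholder tokens."""
    return PATTERN.sub(_token_for, text)

print(replace_non_persian("نسخه 2.0 در GitHub منتشر شد: https://example.com"))
# نسخه [NUM] در [LAT] منتشر شد: [URL]
```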

Core Capabilities

  • Coherent Persian text generation
  • Classical Persian poetry generation
  • Support for both PyTorch and TensorFlow implementations (see the loading sketch after this list)
  • Efficient processing of Persian-specific character sets
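
To make the framework note above concrete, the same checkpoint can be loaded from either backend with transformers. This is a sketch under the assumption that the repository ships PyTorch weights; from_pt=True lets the TensorFlow class convert them if no native TF weights are available.

```python
from transformers import GPT2LMHeadModel, TFGPT2LMHeadModel

# PyTorch backend
pt_model = GPT2LMHeadModel.from_pretrained("bolbolzaban/gpt2-persian")

# TensorFlow backend; from_pt=True converts PyTorch weights when the
# repository does not include native TensorFlow weights.
tf_model = TFGPT2LMHeadModel.from_pretrained("bolbolzaban/gpt2-persian", from_pt=True)
```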

Frequently Asked Questions

Q: What makes this model unique?

The model's specialization in Persian language processing, combined with its poetry-focused features and efficient tokenization approach, makes it particularly suitable for Persian content generation. The use of special tokens for non-Persian characters ensures clean, focused Persian text output.

Q: What are the recommended use cases?

The model is particularly well-suited for Persian text generation, especially in research contexts involving Persian poetry. It can be used for creative writing, poetry generation, and general Persian language text completion tasks. The model supports fine-tuning for specific use cases and can be easily integrated into both PyTorch and TensorFlow workflows.
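
As a rough starting point for the fine-tuning mentioned above, a standard causal-language-modeling recipe with the transformers Trainer could look like the sketch below. The corpus path persian_corpus.txt and all hyperparameters are placeholders, not published settings.

```python
# Minimal causal-LM fine-tuning sketch; paths and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    Trainer,
    TrainingArguments,
)

model_id = "bolbolzaban/gpt2-persian"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id)

# GPT2-style tokenizers often lack a pad token; reuse EOS if needed.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Plain-text corpus, one document per line (hypothetical file).
dataset = load_dataset("text", data_files={"train": "persian_corpus.txt"})

def tokenize(batch):
    # Stay within the model's 256-token context window.
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="gpt2-persian-finetuned",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```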
