gpt2-persian

Maintained By: bolbolzaban

GPT2-Persian

  • License: Apache 2.0
  • Language: Persian (Farsi)
  • Framework: PyTorch/TensorFlow
  • Context Window: 256 tokens

What is gpt2-persian?

GPT2-Persian is a language model based on the GPT2-medium architecture and trained specifically for Persian text generation. Developed by bolbolzaban, it reduces the context window to 256 tokens and uses Google's SentencePiece tokenizer in place of the standard BPE tokenization.
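
For a quick start, the model can be loaded with the Hugging Face transformers library. The snippet below is a minimal sketch: the Hub id bolbolzaban/gpt2-persian, the example Persian prompt, and the generation settings are assumptions to adjust for your own setup.

```python
# Minimal text-generation sketch; assumes the checkpoint is published on the
# Hugging Face Hub under the id "bolbolzaban/gpt2-persian".
from transformers import AutoTokenizer, GPT2LMHeadModel, pipeline

tokenizer = AutoTokenizer.from_pretrained("bolbolzaban/gpt2-persian")
model = GPT2LMHeadModel.from_pretrained("bolbolzaban/gpt2-persian")

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Example Persian prompt; keep max_length within the 256-token context window.
sample = generator("در یک اتفاق شگفت انگیز، پژوهشگران", max_length=64, do_sample=True)
print(sample[0]["generated_text"])
```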

Implementation Details

The model implements several key modifications to the original GPT2 architecture to better serve Persian language processing. Non-Persian characters are handled with designated special tokens such as [LAT], [URL], and [NUM], and the vocabulary includes poetry-specific tokens: [BOM] for the beginning of a verse and [EOS] for the end of a statement. A preprocessing sketch follows the list below.

  • Reduced context size (256 tokens) for improved training efficiency
  • Custom tokenization using Google SentencePiece
  • Special token handling for non-Persian characters
  • Poetry-specific tokens for verse structure
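
The training-time preprocessing is not reproduced here, but the special-token replacement described above can be illustrated with a rough sketch. The regular expressions, their ordering, and the helper name replace_non_persian are illustrative assumptions, not the author's actual pipeline.

```python
import re

# Illustrative approximation of mapping non-Persian content to the
# [URL], [NUM], and [LAT] placeholders; the real training rules may differ.
PATTERN = re.compile(
    r"(?P<url>https?://\S+|www\.\S+)"    # whole URLs first
    r"|(?P<num>\d+(?:[.,]\d+)*)"         # integers and decimals
    r"|(?P<lat>[A-Za-z][A-Za-z'\-.]*)"   # Latin-script words
)

def _token_for(match: re.Match) -> str:
    if match.group("url"):
        return "[URL]"
    if match.group("num"):
        return "[NUM]"
    return "[LAT]"

def replace_non_persian(text: str) -> str:
    """Replace URLs, numbers, and Latin-script words with placeholder tokens."""
    return PATTERN.sub(_token_for, text)

print(replace_non_persian("نسخه 2.0 در GitHub منتشر شد: https://example.com"))
# نسخه [NUM] در [LAT] منتشر شد: [URL]
```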

Core Capabilities

  • Coherent Persian text generation
  • Classical Persian poetry generation
  • Support for both PyTorch and TensorFlow implementations (see the loading sketch after this list)
  • Efficient processing of Persian-specific character sets
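
To make the framework note above concrete, the same checkpoint can be loaded from either backend with transformers. This is a sketch under the assumption that the repository ships PyTorch weights; from_pt=True lets the TensorFlow class convert them if no native TF weights are available.

```python
from transformers import GPT2LMHeadModel, TFGPT2LMHeadModel

# PyTorch backend
pt_model = GPT2LMHeadModel.from_pretrained("bolbolzaban/gpt2-persian")

# TensorFlow backend; from_pt=True converts PyTorch weights when the
# repository does not include native TensorFlow weights.
tf_model = TFGPT2LMHeadModel.from_pretrained("bolbolzaban/gpt2-persian", from_pt=True)
```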

Frequently Asked Questions

Q: What makes this model unique?

The model's specialization in Persian language processing, combined with its poetry-focused features and efficient tokenization approach, makes it particularly suitable for Persian content generation. The use of special tokens for non-Persian characters ensures clean, focused Persian text output.

Q: What are the recommended use cases?

The model is particularly well-suited for Persian text generation, especially in research contexts involving Persian poetry. It can be used for creative writing, poetry generation, and general Persian language text completion tasks. The model supports fine-tuning for specific use cases and can be easily integrated into both PyTorch and TensorFlow workflows.
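
As a rough starting point for the fine-tuning mentioned above, a standard causal-language-modeling recipe with the transformers Trainer could look like the sketch below. The corpus path persian_corpus.txt and all hyperparameters are placeholders, not published settings.

```python
# Minimal causal-LM fine-tuning sketch; paths and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    Trainer,
    TrainingArguments,
)

model_id = "bolbolzaban/gpt2-persian"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id)

# GPT2-style tokenizers often lack a pad token; reuse EOS if needed.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Plain-text corpus, one document per line (hypothetical file).
dataset = load_dataset("text", data_files={"train": "persian_corpus.txt"})

def tokenize(batch):
    # Stay within the model's 256-token context window.
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="gpt2-persian-finetuned",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```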
