gpt2-small-portuguese

Maintained by: pierreguillou


  • Parameter Count: 124M
  • Training Data: Portuguese Wikipedia (1.28 GB)
  • License: MIT
  • Performance: 37.99% accuracy, 23.76 perplexity
  • Model Size: 487 MB (PyTorch) / 475 MB (TensorFlow)

What is gpt2-small-portuguese?

GPT2-small-portuguese is a state-of-the-art language model for Portuguese text generation, fine-tuned from the English GPT-2 small model using transfer learning techniques. Developed by Pierre Guillou, it was trained on Portuguese Wikipedia data in just over a day using a single NVIDIA V100 32GB GPU.

Implementation Details

The model is distributed through the Hugging Face Transformers library and can be used with both the PyTorch and TensorFlow frameworks. It was trained using fastai v2 techniques and reaches its reported 37.99% validation accuracy and 23.76 perplexity with relatively limited computational resources; a loading sketch follows the list below.

  • Training time: approximately 30 hours on a single NVIDIA V100 32GB GPU
  • Architecture: GPT-2 small base model
  • Training data: 1.28 GB Portuguese Wikipedia corpus
  • Validation accuracy: 37.99%
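
As a concrete illustration of the PyTorch path, the sketch below loads the published checkpoint from the Hugging Face Hub and generates a short continuation. The prompt and sampling parameters are illustrative assumptions, not settings prescribed by the author.

```python
# Minimal PyTorch sketch using the Hugging Face Transformers library.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_id = "pierreguillou/gpt2-small-portuguese"
tokenizer = GPT2Tokenizer.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id)

# Example Portuguese prompt: "Who was Jim Henson? Jim Henson was a"
prompt = "Quem era Jim Henson? Jim Henson era um"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample up to 60 new tokens; top_k=40 is an illustrative choice.
outputs = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=True,
    top_k=40,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```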

Core Capabilities

  • Portuguese text generation
  • Next word prediction
  • Continuous text completion
  • Support for both PyTorch and TensorFlow implementations
  • Maximum sequence length of 1024 tokens
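
Because the repository also ships TensorFlow weights (the 475 MB file noted above), the same checkpoint can be loaded with the TF classes. A minimal sketch, assuming the Hub repo exposes those weights directly:

```python
# TensorFlow variant; pass from_pt=True instead if only PyTorch weights are available.
from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

model_id = "pierreguillou/gpt2-small-portuguese"
tokenizer = GPT2Tokenizer.from_pretrained(model_id)
model = TFGPT2LMHeadModel.from_pretrained(model_id)

inputs = tokenizer("Ontem eu visitei o museu e", return_tensors="tf")
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,
    top_k=40,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```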

Frequently Asked Questions

Q: What makes this model unique?

This model demonstrates that high-quality language models can be created for non-English languages using transfer learning, achieving good results with limited computational resources and training data.

Q: What are the recommended use cases?

The model is best suited for Portuguese text generation tasks, including creative writing, content generation, and text completion. However, users should be aware of potential biases inherited from the training data.
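
For quick experiments with these use cases, the high-level text-generation pipeline wraps tokenization, generation, and decoding in one call; the prompt and length below are only examples:

```python
from transformers import pipeline

# High-level wrapper for Portuguese text completion.
generator = pipeline("text-generation", model="pierreguillou/gpt2-small-portuguese")
result = generator("A inteligência artificial vai", max_new_tokens=50, num_return_sequences=1)
print(result[0]["generated_text"])
```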
