GPT2-Small-Portuguese
| Property | Value |
|---|---|
| Parameter Count | 124M |
| Training Data | Portuguese Wikipedia (1.28 GB) |
| License | MIT |
| Performance | 37.99% accuracy, 23.76 perplexity |
| Model Size | 487 MB (PyTorch) / 475 MB (TensorFlow) |
What is gpt2-small-portuguese?
GPT2-small-portuguese is a state-of-the-art language model for Portuguese text generation, fine-tuned from the English GPT-2 small model using transfer learning techniques. Developed by Pierre Guillou, it was trained on Portuguese Wikipedia data in just over a day using a single NVIDIA V100 32GB GPU.
Implementation Details
The model is built on the Hugging Face Transformers library and can be used from both PyTorch and TensorFlow. It was fine-tuned with fastai v2 training techniques and reaches the validation metrics below on relatively modest hardware; a minimal loading sketch follows the list.
- Training time: Approximately 30 hours on a single NVIDIA V100 32GB GPU
- Architecture: GPT-2 small base model
- Training data: 1.28 GB Portuguese Wikipedia corpus
- Validation accuracy: 37.99%
- Validation perplexity: 23.76
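
The sketch below shows one way to load the model with the Transformers auto classes; it assumes the Hugging Face Hub id `pierreguillou/gpt2-small-portuguese`, which may need adjusting if you work from a local copy.

```python
# A minimal loading sketch, assuming the Hub id "pierreguillou/gpt2-small-portuguese".
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "pierreguillou/gpt2-small-portuguese"

tokenizer = AutoTokenizer.from_pretrained(model_id)     # GPT-2 BPE tokenizer
model = AutoModelForCausalLM.from_pretrained(model_id)  # PyTorch weights

# The TensorFlow weights can be loaded analogously:
# from transformers import TFAutoModelForCausalLM
# tf_model = TFAutoModelForCausalLM.from_pretrained(model_id)
```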
Core Capabilities
- Portuguese text generation (see the sketch after this list)
- Next-word prediction
- Open-ended text continuation
- Support for both PyTorch and TensorFlow implementations
- Maximum sequence length of 1024 tokens
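
To illustrate the generation capability listed above, here is a short sampling sketch using the standard `generate` API; the Hub id and the example Portuguese prompt are assumptions for illustration, not part of this card.

```python
# A short generation sketch (PyTorch), assuming the same Hub id as above.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "pierreguillou/gpt2-small-portuguese"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# "Who was Jim Henson? Jim Henson was a ..."
prompt = "Quem era Jim Henson? Jim Henson era um"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=60,                        # well under the 1024-token context limit
        do_sample=True,                       # sample instead of greedy decoding
        top_k=50,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```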
Frequently Asked Questions
Q: What makes this model unique?
This model demonstrates that high-quality language models can be created for non-English languages using transfer learning, achieving good results with limited computational resources and training data.
Q: What are the recommended use cases?
The model is best suited for Portuguese text generation tasks, including creative writing, content generation, and text completion. However, users should be aware of potential biases inherited from the training data.