GPT2-Small-Portuguese
| Property | Value |
|---|---|
| Parameter Count | 124M |
| Training Data | Portuguese Wikipedia (1.28 GB) |
| License | MIT |
| Performance | 37.99% accuracy, 23.76 perplexity |
| Model Size | 487 MB (PyTorch) / 475 MB (TensorFlow) |
What is gpt2-small-portuguese?
GPT2-small-portuguese is a state-of-the-art language model for Portuguese text generation, fine-tuned from the English GPT-2 small model using transfer learning techniques. Developed by Pierre Guillou, it was trained on Portuguese Wikipedia data in just over a day using a single NVIDIA V100 32GB GPU.
Implementation Details
The model is built on the Hugging Face Transformers library and can be used from both PyTorch and TensorFlow. It was fine-tuned with fastai v2 training techniques and reaches the validation metrics below on relatively modest hardware; a minimal loading sketch follows the list.
- Training time: Approximately 30 hours on a single NVIDIA V100 32GB GPU
- Architecture: GPT-2 small base model
- Training data: 1.28 GB Portuguese Wikipedia corpus
- Validation accuracy: 37.99%
- Validation perplexity: 23.76
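
The sketch below shows one way to load the model with the Transformers auto classes; it assumes the Hugging Face Hub id `pierreguillou/gpt2-small-portuguese`, which may need adjusting if you work from a local copy.

```python
# A minimal loading sketch, assuming the Hub id "pierreguillou/gpt2-small-portuguese".
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "pierreguillou/gpt2-small-portuguese"

tokenizer = AutoTokenizer.from_pretrained(model_id)     # GPT-2 BPE tokenizer
model = AutoModelForCausalLM.from_pretrained(model_id)  # PyTorch weights

# The TensorFlow weights can be loaded analogously:
# from transformers import TFAutoModelForCausalLM
# tf_model = TFAutoModelForCausalLM.from_pretrained(model_id)
```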
Core Capabilities
- Portuguese text generation (see the sketch after this list)
- Next-word prediction
- Open-ended text continuation
- Support for both PyTorch and TensorFlow implementations
- Maximum sequence length of 1024 tokens
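
To illustrate the generation capability listed above, here is a short sampling sketch using the standard `generate` API; the Hub id and the example Portuguese prompt are assumptions for illustration, not part of this card.

```python
# A short generation sketch (PyTorch), assuming the same Hub id as above.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "pierreguillou/gpt2-small-portuguese"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# "Who was Jim Henson? Jim Henson was a ..."
prompt = "Quem era Jim Henson? Jim Henson era um"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=60,                        # well under the 1024-token context limit
        do_sample=True,                       # sample instead of greedy decoding
        top_k=50,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```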
Frequently Asked Questions
Q: What makes this model unique?
This model demonstrates that high-quality language models can be created for non-English languages using transfer learning, achieving good results with limited computational resources and training data.
Q: What are the recommended use cases?
The model is best suited for Portuguese text generation tasks, including creative writing, content generation, and text completion. However, users should be aware of potential biases inherited from the training data.