t5-base-qa-squad-v1.1-portuguese
| Property | Value |
|---|---|
| Task Type | Question Answering |
| Base Model | unicamp-dl/ptt5-base-portuguese-vocab |
| Training Dataset | SQuAD v1.1 Portuguese |
| F1 Score | 79.3 |
| Exact Match | 67.3983 |
What is t5-base-qa-squad-v1.1-portuguese?
This is a question answering (QA) model for Portuguese. It starts from the T5 base checkpoint unicamp-dl/ptt5-base-portuguese-vocab and was fine-tuned by the Deep Learning Brasil group on the Portuguese version of the SQuAD v1.1 dataset. On the validation data it reaches an F1 score of 79.3 and an exact match of 67.4 (both on a 0 to 100 scale).
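The F1 and exact match figures are SQuAD-style answer-overlap metrics. As a rough illustration of how such scores are computed for a single prediction, here is a minimal sketch. The normalization (lowercasing and stripping ASCII punctuation) is a simplifying assumption and not necessarily what the evaluation script behind the reported numbers does, and the function names are illustrative.

```python
import string
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall against the reference."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, the prediction "em dezembro de 2019" against the reference "dezembro de 2019" is not an exact match, but three of its four tokens overlap, giving a token F1 of about 0.857. Dataset-level scores are averages of these per-example values, scaled to 0 to 100.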
Implementation Details
The model was fine-tuned with a text-to-text generation objective using a learning rate of 1e-4, a batch size of 4, and 3 gradient accumulation steps, which gives an effective batch size of 12 on a single device. Training ran for 10 epochs on Google Colab, with F1 score as the metric being optimized.
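To make the interaction between batch size and gradient accumulation concrete, the following sketch records the stated hyperparameters and derives the effective batch size and an approximate optimizer step count. The class name, the single-device assumption, and the `num_examples` argument are illustrative; the actual training script is not shown in this document.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as stated in the text (single device assumed)."""
    learning_rate: float = 1e-4
    per_device_batch_size: int = 4
    gradient_accumulation_steps: int = 3
    num_epochs: int = 10

    @property
    def effective_batch_size(self) -> int:
        # Gradients from several small batches are summed before each update.
        return self.per_device_batch_size * self.gradient_accumulation_steps

    def optimizer_steps(self, num_examples: int) -> int:
        # Approximate count; real trainers may round partial batches differently.
        steps_per_epoch = math.ceil(num_examples / self.effective_batch_size)
        return steps_per_epoch * self.num_epochs


config = FineTuneConfig()
print(config.effective_batch_size)  # 4 * 3 = 12
```

Gradient accumulation is a common way to reach a larger effective batch when GPU memory limits the per-step batch, as is typical on Colab.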
- Utilizes beam search with early stopping for generation
- Supports a maximum target (answer) length of 32 tokens
- Implements efficient batch processing
- Optimized for production deployment
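The generation settings listed above can be sketched with the Hugging Face `transformers` library. Several details here are assumptions rather than facts from this document: the Hub id in `MODEL_ID`, the `question: ... context: ...` input layout, the beam count of 5, and the 512-token input limit. Check the published model card before relying on any of them.

```python
from typing import Any, Dict

# Assumed Hub id; this document gives only the short model name.
MODEL_ID = "pierreguillou/t5-base-qa-squad-v1.1-portuguese"

# max_length and early_stopping follow the list above; num_beams is an assumed value.
GENERATION_KWARGS: Dict[str, Any] = {
    "max_length": 32,
    "num_beams": 5,
    "early_stopping": True,
}


def build_prompt(question: str, context: str) -> str:
    """Assumed T5-style input layout for QA."""
    return f"question: {question.strip()} context: {context.strip()}"


def answer(question: str, context: str, model_id: str = MODEL_ID) -> str:
    """Load the model and generate an answer with beam search."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    inputs = tokenizer(
        build_prompt(question, context),
        return_tensors="pt",
        truncation=True,
        max_length=512,  # assumed input limit
    )
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    contexto = "A pandemia de COVID-19 foi identificada pela primeira vez em Wuhan, em dezembro de 2019."
    print(answer("Quando foi identificada a COVID-19?", contexto))
```

In a real service the tokenizer and model would be loaded once and reused across requests rather than reloaded on every call to `answer`.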
Core Capabilities
- Portuguese language question answering
- Context-based answer extraction
- Production-ready inference pipeline
- Efficient text generation with beam search
Frequently Asked Questions
Q: What makes this model unique?
The model targets Portuguese QA specifically: it pairs a Portuguese T5 base checkpoint with fine-tuning on the Portuguese SQuAD v1.1 dataset. At base size it offers a balance of accuracy and inference cost that makes it a reasonable candidate for production deployments.
Q: What are the recommended use cases?
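The model suits Portuguese-language applications that need automated question answering, such as customer service automation, information extraction from documents, and educational tools. It performs best with clear questions and a well-structured context passage. Because SQuAD v1.1 contains only questions whose answer appears in the accompanying context, the model should be expected to work best when the answer is actually stated in the supplied text.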