# BETO Spanish BERT for Question Answering
| Property | Value |
|---|---|
| Author | mrm8488 |
| Downloads | 33,181 |
| Task Type | Question Answering |
| Performance (F1) | 86.07% |
## What is bert-base-spanish-wwm-cased-finetuned-spa-squad2-es?

This is a Spanish language model based on BETO (Spanish BERT), fine-tuned for question answering on the Spanish version of the SQuAD2.0 dataset. The base model was pretrained with whole-word masking and is case-sensitive, which helps it capture Spanish-specific distinctions such as proper nouns and accented forms.
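A minimal usage sketch with the Hugging Face `transformers` question-answering pipeline (the question and context strings are illustrative, not from the model card):

```python
from transformers import pipeline

MODEL_ID = "mrm8488/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es"

# The question-answering pipeline wraps tokenization, inference,
# and answer-span decoding in a single call.
qa = pipeline("question-answering", model=MODEL_ID, tokenizer=MODEL_ID)

result = qa(
    question="¿Quién escribió Don Quijote?",
    context="Don Quijote de la Mancha fue escrito por Miguel de Cervantes "
            "y publicado a comienzos del siglo XVII.",
)
print(result["answer"], result["score"])
```

The pipeline returns a dict with the extracted answer text, its character offsets in the context, and a confidence score.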
## Implementation Details

The model was fine-tuned on a Tesla P100 GPU using the `transformers` library, with a learning rate of 3e-5, a batch size of 12, and 2 training epochs. Input sequences are limited to 384 tokens, with a 128-token document stride for longer contexts.
- Based on dccuchile/bert-base-spanish-wwm-cased architecture
- Trained on 111K Spanish QA pairs from SQuAD2.0-es-v2.0
- Achieves 76.50% exact match accuracy
- Supports handling of unanswerable questions
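The 384-token limit and 128-token stride mean that contexts longer than the window are split into overlapping chunks before inference. A minimal illustration of that chunking logic (a simplified sketch, not the library's actual preprocessing code):

```python
def sliding_windows(tokens, max_len=384, stride=128):
    """Split a token sequence into overlapping windows, as SQuAD-style
    preprocessing does for contexts longer than max_len.

    `stride` is the number of tokens shared by consecutive windows,
    so each window starts (max_len - stride) tokens after the last.
    """
    if len(tokens) <= max_len:
        return [tokens]
    windows = []
    start = 0
    while start < len(tokens):
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # last window already covers the end of the context
        start += max_len - stride
    return windows

# A 600-token context yields two windows: tokens 0-383 and 256-599,
# overlapping on 128 tokens so no answer span is cut in half.
chunks = sliding_windows(list(range(600)))
print(len(chunks))  # 2
```

The overlap ensures that an answer span falling near a window boundary is still fully contained in at least one chunk.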
## Core Capabilities
- Spanish language question answering
- Handles both answerable and unanswerable questions
- Processes cased text, preserving Spanish-specific distinctions such as proper nouns
- Returns the extracted answer text together with its span positions in the context
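Handling unanswerable questions follows the SQuAD2.0 convention: the model also scores an empty "null" answer, and predicts no answer when that score beats the best candidate span. A minimal sketch of that decision rule (the threshold default here is an assumption for illustration, not the model's tuned value):

```python
def is_unanswerable(best_span_score: float, null_score: float,
                    null_threshold: float = 0.0) -> bool:
    """SQuAD2.0-style no-answer decision: predict the empty answer when
    the null score exceeds the best span score by more than the threshold."""
    return null_score - best_span_score > null_threshold

# Span score clearly wins -> treated as answerable.
print(is_unanswerable(best_span_score=7.2, null_score=1.5))  # False
# Null score dominates -> treated as unanswerable.
print(is_unanswerable(best_span_score=0.8, null_score=4.1))  # True
```

In practice the threshold is tuned on a development set to trade off precision on answerable questions against recall on unanswerable ones.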
## Frequently Asked Questions
**Q: What makes this model unique?**

It combines BETO's strong Spanish language understanding with task-specific fine-tuning for question answering, reaching an F1 score of 86.07% on Spanish SQuAD2.0 while handling both answerable and unanswerable questions.
**Q: What are the recommended use cases?**

The model suits Spanish-language applications that need question answering, such as customer-service automation, information extraction from Spanish documents, and educational tools. It is particularly useful in scenarios where recognizing that a question has no answer in the given context is as important as extracting one.