# BETO Spanish BERT for Question Answering
| Property | Value |
|---|---|
| Author | mrm8488 |
| Downloads | 33,181 |
| Task Type | Question Answering |
| Performance (F1) | 86.07% |
## What is bert-base-spanish-wwm-cased-finetuned-spa-squad2-es?

This is a Spanish language model based on BETO (Spanish BERT), fine-tuned for question answering on the Spanish version of the SQuAD2.0 dataset. The base model was pretrained with whole-word masking and is case-sensitive, which helps it capture Spanish-specific distinctions such as proper nouns and accented forms.
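A minimal usage sketch with the Hugging Face `transformers` question-answering pipeline (the question and context strings are illustrative, not from the model card):

```python
from transformers import pipeline

MODEL_ID = "mrm8488/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es"

# The question-answering pipeline wraps tokenization, inference,
# and answer-span decoding in a single call.
qa = pipeline("question-answering", model=MODEL_ID, tokenizer=MODEL_ID)

result = qa(
    question="¿Quién escribió Don Quijote?",
    context="Don Quijote de la Mancha fue escrito por Miguel de Cervantes "
            "y publicado a comienzos del siglo XVII.",
)
print(result["answer"], result["score"])
```

The pipeline returns a dict with the extracted answer text, its character offsets in the context, and a confidence score.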
## Implementation Details

The model was fine-tuned on a Tesla P100 GPU using the `transformers` library, with a learning rate of 3e-5, a batch size of 12, and 2 training epochs. Input sequences are limited to 384 tokens, with a 128-token document stride for longer contexts.
- Based on dccuchile/bert-base-spanish-wwm-cased architecture
- Trained on 111K Spanish QA pairs from SQuAD2.0-es-v2.0
- Achieves 76.50% exact match accuracy
- Supports handling of unanswerable questions
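The 384-token limit and 128-token stride mean that contexts longer than the window are split into overlapping chunks before inference. A minimal illustration of that chunking logic (a simplified sketch, not the library's actual preprocessing code):

```python
def sliding_windows(tokens, max_len=384, stride=128):
    """Split a token sequence into overlapping windows, as SQuAD-style
    preprocessing does for contexts longer than max_len.

    `stride` is the number of tokens shared by consecutive windows,
    so each window starts (max_len - stride) tokens after the last.
    """
    if len(tokens) <= max_len:
        return [tokens]
    windows = []
    start = 0
    while start < len(tokens):
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # last window already covers the end of the context
        start += max_len - stride
    return windows

# A 600-token context yields two windows: tokens 0-383 and 256-599,
# overlapping on 128 tokens so no answer span is cut in half.
chunks = sliding_windows(list(range(600)))
print(len(chunks))  # 2
```

The overlap ensures that an answer span falling near a window boundary is still fully contained in at least one chunk.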
## Core Capabilities
- Spanish language question answering
- Handles both answerable and unanswerable questions
- Processes cased text, preserving Spanish-specific distinctions such as proper nouns
- Returns the extracted answer text together with its span positions in the context
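Handling unanswerable questions follows the SQuAD2.0 convention: the model also scores an empty "null" answer, and predicts no answer when that score beats the best candidate span. A minimal sketch of that decision rule (the threshold default here is an assumption for illustration, not the model's tuned value):

```python
def is_unanswerable(best_span_score: float, null_score: float,
                    null_threshold: float = 0.0) -> bool:
    """SQuAD2.0-style no-answer decision: predict the empty answer when
    the null score exceeds the best span score by more than the threshold."""
    return null_score - best_span_score > null_threshold

# Span score clearly wins -> treated as answerable.
print(is_unanswerable(best_span_score=7.2, null_score=1.5))  # False
# Null score dominates -> treated as unanswerable.
print(is_unanswerable(best_span_score=0.8, null_score=4.1))  # True
```

In practice the threshold is tuned on a development set to trade off precision on answerable questions against recall on unanswerable ones.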
## Frequently Asked Questions
**Q: What makes this model unique?**

It combines BETO's strong Spanish language understanding with task-specific fine-tuning for question answering, reaching an F1 score of 86.07% on Spanish SQuAD2.0 while handling both answerable and unanswerable questions.
**Q: What are the recommended use cases?**

The model suits Spanish-language applications that need question answering, such as customer-service automation, information extraction from Spanish documents, and educational tools. It is particularly useful in scenarios where recognizing that a question has no answer in the given context is as important as extracting one.