roberta-large-squad2

Maintained by: deepset

Property          Value
Parameter Count   354M
License           CC-BY-4.0
Base Model        RoBERTa-large
Task              Extractive Question Answering
Training Data     SQuAD 2.0

What is roberta-large-squad2?

roberta-large-squad2 is an extractive question-answering model built on the RoBERTa-large architecture and fine-tuned on the SQuAD 2.0 dataset. On the SQuAD 2.0 validation set it reaches 85.168% exact match and 88.349% F1.

Implementation Details

The model uses the RoBERTa-large architecture and was trained on 4x Tesla V100 GPUs. Because SQuAD 2.0 contains both answerable and unanswerable questions, the model learns to abstain when no answer span exists, which makes it more robust for real-world use. It can be integrated through either the Haystack framework or the Transformers library.

  • Based on the robust RoBERTa-large architecture
  • Fine-tuned specifically for extractive question answering
  • Handles both answerable and unanswerable questions
  • Optimized for production deployment
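As a minimal sketch of the Transformers integration described above: the snippet below loads the model through the question-answering pipeline, assuming it is hosted on the Hugging Face Hub under the id "deepset/roberta-large-squad2" (the id is inferred from the maintainer name and is an assumption).

```python
# Minimal sketch: run extractive QA with the Transformers pipeline.
# Assumption: the model is published on the Hugging Face Hub as
# "deepset/roberta-large-squad2".
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-large-squad2")

result = qa(
    question="What architecture is the model based on?",
    context=(
        "roberta-large-squad2 is built on the RoBERTa-large architecture "
        "and fine-tuned on the SQuAD 2.0 dataset."
    ),
)
# The pipeline returns a dict with the extracted span, a confidence
# score, and the character offsets of the answer in the context.
print(result["answer"], result["score"])
```

The same model id can be passed to Haystack's reader components; only the surrounding retrieval machinery differs.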

Core Capabilities

  • Achieves 85.168% exact match on SQuAD2.0
  • Performs well across different domains (NYT: 84.352% EM, New Wiki: 82.338% EM)
  • Supports seamless integration with Haystack and Transformers pipelines
  • Handles complex question-answering scenarios
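Since SQuAD 2.0 training teaches the model to abstain, the Transformers pipeline can be told to allow empty answers via its `handle_impossible_answer` flag. A sketch, again assuming the Hub id "deepset/roberta-large-squad2":

```python
# Sketch of unanswerable-question handling; the model id
# "deepset/roberta-large-squad2" is an assumed Hub id.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-large-squad2")

result = qa(
    question="Who wrote the novel mentioned in the text?",
    context="RoBERTa-large was pretrained on a large English corpus.",
    handle_impossible_answer=True,  # permit an empty answer for no-answer cases
)
# When the model judges the question unanswerable from the context,
# the returned answer may be the empty string.
print(repr(result["answer"]))
```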

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its ability to handle unanswerable questions and for holding up under domain shift: alongside its SQuAD 2.0 scores, it posts strong exact-match results on the NYT and New Wiki evaluation sets.

Q: What are the recommended use cases?

The model is ideal for extractive question answering tasks in production environments, particularly when accuracy is crucial. It's well-suited for applications in document analysis, information extraction, and automated question answering systems.