XLM-RoBERTa Large SQuAD2

Property	Value
Base Architecture	XLM-RoBERTa Large
Task	Extractive Question Answering
Training Data	SQuAD 2.0
Languages	Multilingual
Author	deepset
Model URL	deepset/xlm-roberta-large-squad2

What is xlm-roberta-large-squad2?

XLM-RoBERTa Large SQuAD2 is a multilingual extractive question answering model built on the XLM-RoBERTa large architecture and fine-tuned on the SQuAD 2.0 dataset. It reaches an 83.79% F1 score on the English SQuAD 2.0 dev set and also reports strong results on the German MLQA and XQuAD datasets.

Implementation Details

The model was trained with a batch size of 32 for 3 epochs, using a maximum sequence length of 256. It used a linear warmup learning rate schedule with a warmup proportion of 0.2 and a base learning rate of 1e-5. Training ran on 4 Tesla V100 GPUs.
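To make the learning rate schedule concrete, here is a minimal sketch of a linear warmup with a warmup proportion of 0.2 and a base learning rate of 1e-5. The function name `linear_warmup_lr` is hypothetical, and the linear decay to zero after warmup is an assumption; the source only states that the schedule uses linear warmup.

```python
def linear_warmup_lr(step: int, total_steps: int,
                     base_lr: float = 1e-5,
                     warmup_proportion: float = 0.2) -> float:
    """Learning rate at a given 0-indexed optimizer step (illustrative only)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    warmup_steps = int(total_steps * warmup_proportion)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase (assumed, not stated in the source): ramp linearly
    # from base_lr back down to 0 at the final step.
    decay_steps = max(1, total_steps - warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / decay_steps
```

With 1,000 total steps, the first 200 steps are warmup, the rate peaks at 1e-5 at step 200, and it falls back toward zero by step 1,000.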

Maximum query length: 64 tokens
Document stride: 128 tokens
Base model: xlm-roberta-large
Integration support for both Haystack and Transformers libraries
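
The sequence length, query length, and document stride settings together determine how a long context is split into windows. The sketch below illustrates the idea under two labeled assumptions: the document stride is treated as the step between window starts (as in the original BERT SQuAD script), and 4 special tokens are reserved per question-context pair. The function name `split_into_windows` is hypothetical and this is not the training code itself.

```python
from typing import List, Tuple


def split_into_windows(num_context_tokens: int,
                       num_query_tokens: int,
                       max_seq_len: int = 256,
                       max_query_len: int = 64,
                       doc_stride: int = 128,
                       num_special_tokens: int = 4) -> List[Tuple[int, int]]:
    """Return (start, end) context-token offsets, end exclusive, one per window."""
    if doc_stride <= 0:
        raise ValueError("doc_stride must be positive")
    # Queries longer than max_query_len are truncated.
    query_len = min(num_query_tokens, max_query_len)
    # Context budget per window after the query and special tokens (assumed: 4).
    window_len = max_seq_len - query_len - num_special_tokens
    if window_len <= 0:
        raise ValueError("no room left for context tokens")

    windows = []
    start = 0
    while True:
        end = min(start + window_len, num_context_tokens)
        windows.append((start, end))
        if end >= num_context_tokens:
            break
        # Assumption: doc_stride is the distance between window starts.
        start += doc_stride
    return windows
```

For a 500-token context and a 10-token question, each window holds up to 242 context tokens and consecutive windows start 128 tokens apart, so neighboring windows overlap.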

Core Capabilities

Multilingual extractive question answering
English QA: 79.46% exact match and 83.79% F1 score on the SQuAD 2.0 dev set
German QA: 61.51% exact match on XQuAD
No-answer detection capability
Long-document processing by splitting contexts into windows (document stride of 128 tokens)
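
The exact match and F1 figures above are SQuAD-style metrics. As a rough illustration of what they measure, the sketch below scores one prediction against one gold answer. The normalization (lowercasing, stripping punctuation and English articles) follows the common SQuAD convention and is English-specific; the function names are hypothetical. The official evaluation additionally takes the best score over several gold answers and averages over all questions.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between a predicted and a gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    # No-answer case: full credit only if both sides are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

An empty string stands for "no answer" here, so a model that correctly abstains on an unanswerable question receives full credit on both metrics.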

Frequently Asked Questions

Q: What makes this model unique?

This model pairs the multilingual coverage of XLM-RoBERTa with extractive question answering fine-tuned on SQuAD 2.0, which makes it useful for organizations that need a single QA model across several languages. Because SQuAD 2.0 includes unanswerable questions, the model can also signal when a passage contains no answer.

Q: What are the recommended use cases?

The model is suited to building multilingual question answering systems, document search applications, and information extraction tools. It fits applications that need cross-lingual capabilities and can be integrated into production systems using frameworks like Haystack.
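
As a starting point for integration, the following sketch loads the model through the Transformers `question-answering` pipeline. The helper names (`build_qa_pipeline`, `answer_question`, `describe_result`) are hypothetical. Passing the training-time sequence settings at inference is a choice made here for illustration, not a documented requirement, and note that Transformers interprets `doc_stride` as the overlap between chunks.

```python
from typing import Any, Dict, Optional

MODEL_NAME = "deepset/xlm-roberta-large-squad2"


def build_qa_pipeline(model_name: str = MODEL_NAME):
    """Create a question-answering pipeline (downloads the model on first use)."""
    # Imported lazily so describe_result() works without transformers installed.
    from transformers import pipeline
    return pipeline("question-answering", model=model_name, tokenizer=model_name)


def describe_result(result: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Map a pipeline result to an explicit answer / no-answer record."""
    answer = (result.get("answer") or "").strip()
    return {
        "has_answer": bool(answer),       # empty string means "no answer"
        "answer": answer or None,
        "score": result.get("score"),
    }


def answer_question(qa, question: str, context: str) -> Dict[str, Optional[Any]]:
    """Run one question against one context with no-answer handling enabled."""
    result = qa(
        question=question,
        context=context,
        max_seq_len=256,                  # assumption: reuse training value
        max_question_len=64,              # assumption: reuse training value
        doc_stride=128,                   # overlap between chunks in Transformers
        handle_impossible_answer=True,    # allow an empty "no answer" prediction
    )
    return describe_result(result)


# Example usage (requires transformers and a model download):
# qa = build_qa_pipeline()
# print(answer_question(qa, "Wo liegt Berlin?", "Berlin liegt in Deutschland."))
```

Keeping `describe_result` separate from the model call lets downstream code treat "no answer" as an explicit state instead of checking for an empty string.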