xlm-roberta-large-squad2

Maintained By
deepset

XLM-RoBERTa Large SQuAD2

Property            Value
Base Architecture   XLM-RoBERTa Large
Task                Extractive Question Answering
Training Data       SQuAD 2.0
Languages           Multilingual
Author              deepset
Model URL           deepset/xlm-roberta-large-squad2

What is xlm-roberta-large-squad2?

XLM-RoBERTa Large SQuAD2 is a multilingual extractive question answering model: the XLM-RoBERTa large architecture fine-tuned on the SQuAD 2.0 dataset. It reaches an 83.79% F1 score on the English SQuAD 2.0 dev set and is also evaluated on the German MLQA and XQuAD datasets (for example, 61.51% exact match on German XQuAD).
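The page does not define its metrics. Assuming they follow the usual SQuAD-style evaluation (exact match after light text normalization, and token-overlap F1), a minimal sketch of the per-example scores could look like this. The function names are illustrative, and the article stripping is specific to English.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)

def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))

def f1_score(prediction, truth):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    # An empty string stands for "no answer": both empty counts as a match.
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

A dataset-level figure such as 83.79% would then be the average of the per-example scores, expressed as a percentage.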

Implementation Details

The model was trained for 3 epochs with a batch size of 32 and a maximum sequence length of 256 tokens. The learning rate follows a linear warmup schedule with a warmup proportion of 0.2 and a base learning rate of 1e-5. Training ran on 4 Tesla V100 GPUs.
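To make the schedule concrete, here is a minimal sketch of a linear warmup learning rate. It assumes the rate decays linearly to zero after the warmup phase, which is a common pairing but is not stated on this page. The function name and the `total_steps` value are illustrative.

```python
def linear_warmup_lr(step, total_steps, base_lr=1e-5, warmup_proportion=0.2):
    """Learning rate at a given optimizer step.

    Rises linearly from 0 to base_lr over the first warmup_proportion of
    training, then (assumption) falls linearly back to 0 by the last step.
    """
    warmup_steps = int(total_steps * warmup_proportion)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    decay_steps = max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / decay_steps)


# Example: print the rate at a few points of a hypothetical 1000-step run.
for s in (0, 100, 200, 600, 1000):
    print(s, linear_warmup_lr(s, total_steps=1000))
```

With a warmup proportion of 0.2, the peak rate of 1e-5 is reached one fifth of the way through training.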

  • Maximum query length: 64 tokens
  • Document stride: 128 tokens
  • Base model: xlm-roberta-large
  • Integration support for both Haystack and Transformers libraries
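The sequence length, query length, and document stride settings above interact when a passage is longer than one input window. The sketch below shows one way the windows could be laid out. It assumes the stride is the step between consecutive window starts and that 4 special tokens are reserved per question/passage pair. Both are assumptions, not details confirmed by this page, and `split_into_windows` is a hypothetical helper.

```python
def split_into_windows(n_passage_tokens, n_query_tokens, max_seq_len=256,
                       doc_stride=128, max_query_len=64, n_special=4):
    """Return (start, end) token offsets of passage windows, end exclusive."""
    # The query is truncated to max_query_len tokens.
    query_len = min(n_query_tokens, max_query_len)
    # Tokens left for the passage once query and special tokens are placed.
    budget = max_seq_len - query_len - n_special
    if budget <= 0:
        raise ValueError("no room left for passage tokens")
    # Never step further than one window, so no passage token is skipped.
    step = min(doc_stride, budget)
    windows = []
    start = 0
    while True:
        end = min(start + budget, n_passage_tokens)
        windows.append((start, end))
        if end >= n_passage_tokens:
            break
        start += step
    return windows


# A hypothetical 500-token passage with a 10-token question.
print(split_into_windows(500, 10))
```

Because the step is smaller than the window, neighbouring windows overlap, so an answer that straddles one window boundary can still appear whole in the next window.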

Core Capabilities

  • Multilingual extractive question answering
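  • English QA on the SQuAD 2.0 dev set: 79.46% exact match, 83.79% F1 score
  • German QA: 61.51% exact match on XQuAD
  • No-answer detection: SQuAD 2.0 includes unanswerable questions, so the model can indicate that a passage contains no answer
  • Long documents are processed in overlapping windows (document stride of 128 tokens)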
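The page does not say how the no-answer decision is made. A common approach for SQuAD 2.0-style models, assumed here, is to compare the score of a "null" span at the first token position against the best real span, and return no answer when the null score wins by more than a threshold. The function name, the threshold default, and the span length limit are illustrative.

```python
def pick_answer(start_logits, end_logits, max_answer_len=30, null_threshold=0.0):
    """Return the best (start, end) token span, or None for "no answer".

    Assumes index 0 is the sequence-start token, whose start and end scores
    together represent the no-answer option.
    """
    null_score = start_logits[0] + end_logits[0]
    best_score, best_span = float("-inf"), None
    # Search every span that starts after position 0 and is not too long.
    for s in range(1, len(start_logits)):
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best_score, best_span = score, (s, e)
    # Prefer "no answer" when the null score beats the best span by a margin.
    if best_span is None or null_score - best_score > null_threshold:
        return None
    return best_span
```

Raising `null_threshold` makes the sketch less willing to abstain. A real pipeline would also restrict candidate spans to passage tokens, which is omitted here for brevity.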

Frequently Asked Questions

Q: What makes this model unique?

This model pairs the multilingual XLM-RoBERTa encoder with SQuAD 2.0 fine-tuning, so a single model can answer questions in multiple languages and can also signal when a passage contains no answer. That combination makes it useful for organizations that need multilingual QA.

Q: What are the recommended use cases?

The model suits multilingual question answering systems, document search applications, and information extraction tools, particularly where cross-lingual capability is required. It can be integrated into production systems with frameworks such as Haystack.
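As a starting point for integration, here is a minimal sketch that uses the Transformers `question-answering` pipeline. It assumes the `transformers` package and a backend such as PyTorch are installed, and that the model weights can be downloaded on first use. `load_qa_pipeline` and `ask` are hypothetical helper names, not part of any library.

```python
MODEL_NAME = "deepset/xlm-roberta-large-squad2"

def load_qa_pipeline(model_name=MODEL_NAME):
    """Build a question-answering pipeline (downloads weights on first use)."""
    # Imported lazily so the helpers below can be used without the library.
    from transformers import pipeline
    return pipeline("question-answering", model=model_name, tokenizer=model_name)

def ask(qa, question, context):
    """Return (answer, score); answer is None when the model abstains."""
    result = qa(
        question=question,
        context=context,
        # Let the pipeline return an empty answer for unanswerable questions.
        handle_impossible_answer=True,
    )
    return (result["answer"] or None), result["score"]
```

A call might look like `ask(load_qa_pipeline(), "Wer pflegt das Modell?", "Das Modell wird von deepset gepflegt.")`. Since loading the large model is slow, it makes sense to build the pipeline once and reuse it across questions.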
