xlm-roberta-mushroom-qa


  • Author: MichielPronk
  • Framework: PyTorch 2.5.1+cu124
  • Task: Hallucination Detection
  • Base model: XLM-RoBERTa

What is xlm-roberta-mushroom-qa?

xlm-roberta-mushroom-qa is a specialized model fine-tuned for the SemEval-2025 Task 3 (Mu-SHROOM) challenge. Its primary function is to identify and extract hallucination spans from large language model outputs, making it a useful tool for improving the reliability of AI systems.

Implementation Details

The model is built on the XLM-RoBERTa architecture and fine-tuned with the AdamW optimizer (betas=(0.9, 0.999), epsilon=1e-08) and a linear learning-rate scheduler. Training ran for 4 epochs with a learning rate of 5e-02 and a batch size of 16 for both training and evaluation.

  • Transformers version: 4.48.3
  • PyTorch version: 2.5.1+cu124
  • Datasets version: 3.3.0
  • Tokenizers version: 0.21.0
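The optimizer and scheduler setup described above can be sketched in plain PyTorch. Note that the tiny linear model and the steps-per-epoch value below are illustrative assumptions for the sketch, not details from the card (in practice the fine-tuned XLM-RoBERTa model would be loaded via transformers):

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

# Hypothetical stand-in for the XLM-RoBERTa token-classification head
model = torch.nn.Linear(768, 2)

EPOCHS = 4             # from the model card
STEPS_PER_EPOCH = 100  # illustrative assumption, not from the card
total_steps = EPOCHS * STEPS_PER_EPOCH

# AdamW with the card's stated hyperparameters
optimizer = AdamW(model.parameters(), lr=5e-2, betas=(0.9, 0.999), eps=1e-8)

# Linear decay from the initial learning rate down to zero over training
scheduler = LambdaLR(optimizer, lambda step: max(0.0, 1 - step / total_steps))

# One simulated epoch of optimizer/scheduler updates
for _ in range(STEPS_PER_EPOCH):
    optimizer.step()
    scheduler.step()

current_lr = optimizer.param_groups[0]["lr"]
```

After one of the four epochs, the linear schedule has reduced the learning rate to 75% of its initial value.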

Core Capabilities

  • Hallucination span detection in LLM outputs
  • Multi-lingual support through XLM-RoBERTa base architecture
  • Optimized for the Mu-SHROOM task specifications
  • Efficient processing with moderate batch sizes
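The card does not include inference code, but span extraction from a token classifier can be sketched as follows. This assumes a binary per-token labelling scheme (1 = hallucinated) and character offsets of the kind a fast tokenizer returns via its offset mapping; both are assumptions for illustration:

```python
def extract_spans(labels, offsets):
    """Merge runs of tokens labelled 1 (hallucinated) into character spans.

    labels  -- per-token 0/1 predictions (assumed labelling scheme)
    offsets -- per-token (start, end) character offsets in the source text
    """
    spans, start, end = [], None, None
    for label, (s, e) in zip(labels, offsets):
        if label == 1:
            if start is None:
                start = s  # open a new span at this token
            end = e        # extend the span to the current token
        elif start is not None:
            spans.append((start, end))  # close the span on a 0 label
            start = None
    if start is not None:
        spans.append((start, end))      # close a span that reaches the end
    return spans

# Example: tokens 2-3 and token 5 are flagged as hallucinated
predicted = extract_spans(
    [0, 1, 1, 0, 1],
    [(0, 3), (4, 9), (10, 14), (15, 18), (19, 24)],
)
```

The resulting character spans can then be mapped back onto the original LLM output to highlight the hallucinated segments.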

Frequently Asked Questions

Q: What makes this model unique?

This model targets the novel task of hallucination-span detection in LLM outputs, and is one of the few models trained specifically for the SemEval-2025 Mu-SHROOM challenge.

Q: What are the recommended use cases?

The model is best suited for detecting and extracting hallucinated content in language model outputs, making it valuable for content verification and AI system evaluation.
