xlm-roberta-mushroom-qa
| Property | Value |
|---|---|
| Author | MichielPronk |
| Framework | PyTorch 2.5.1+cu124 |
| Task | Hallucination Detection |
| Base Model | XLM-RoBERTa |
What is xlm-roberta-mushroom-qa?
xlm-roberta-mushroom-qa is a specialized model fine-tuned for the SemEval 2025 Task 3: Mu-SHROOM challenge. Its primary function is to identify and extract hallucination spans from large language model outputs, making it a useful tool for improving the reliability of AI systems.
Implementation Details
The model is built on the XLM-RoBERTa architecture and fine-tuned with carefully selected hyperparameters. It employs the AdamW optimizer with betas=(0.9, 0.999) and epsilon=1e-08, along with a linear learning rate scheduler. Training ran for 4 epochs with a learning rate of 5e-02 and a batch size of 16 for both training and evaluation.
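Under a linear scheduler with no warmup, the learning rate decays from its initial value to zero over the total number of optimizer steps. A minimal sketch of that schedule (the step count below is illustrative and not taken from the actual training run):

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-2) -> float:
    """Linearly decay the learning rate from base_lr to 0 (no warmup)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

# Illustrative: 1000 total optimizer steps
total = 1000
print(linear_lr(0, total))     # 0.05 at the start of training
print(linear_lr(500, total))   # 0.025 halfway through
print(linear_lr(1000, total))  # 0.0 at the end
```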
- Transformers version: 4.48.3
- PyTorch version: 2.5.1+cu124
- Datasets version: 3.3.0
- Tokenizers version: 0.21.0
Core Capabilities
- Hallucination span detection in LLM outputs
- Multilingual support via the XLM-RoBERTa base architecture
- Optimized for the Mu-SHROOM task specifications
- Efficient processing with moderate batch sizes
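As a sketch of how span-level hallucination detection typically works downstream of a token classifier, the snippet below merges contiguous tokens flagged as hallucinated into character-level spans. The label scheme (`"HAL"`/`"O"`), the example text, and the helper name are assumptions for illustration; they are not taken from this model's actual output format.

```python
def merge_hallucination_spans(labels, offsets):
    """Merge contiguous tokens labelled 'HAL' into (start, end) character spans.

    labels:  per-token labels, e.g. ['O', 'HAL', 'HAL', 'O']  (assumed scheme)
    offsets: per-token (start, end) character offsets into the source text
    """
    spans = []
    current = None
    for label, (start, end) in zip(labels, offsets):
        if label == "HAL":
            if current is None:
                current = [start, end]   # open a new span
            else:
                current[1] = end         # extend the open span
        elif current is not None:
            spans.append(tuple(current))
            current = None
    if current is not None:
        spans.append(tuple(current))
    return spans

# Toy example: the last two tokens ("by", "NASA") are flagged as hallucinated
text = "The Eiffel Tower was built in 1887 by NASA."
labels = ["O", "O", "O", "O", "O", "O", "O", "HAL", "HAL"]
offsets = [(0, 3), (4, 10), (11, 16), (17, 20), (21, 26), (27, 29),
           (30, 34), (35, 37), (38, 42)]
print(merge_hallucination_spans(labels, offsets))  # [(35, 42)]
```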
Frequently Asked Questions
Q: What makes this model unique?
This model targets the emerging task of hallucination detection in LLM outputs and is one of the few models fine-tuned specifically for the SemEval 2025 Mu-SHROOM challenge.
Q: What are the recommended use cases?
The model is best suited for detecting and extracting hallucinated content in language model outputs, making it valuable for content verification and AI system evaluation.