xlm-roberta-base-finetuned-ner-yoruba

  • Author: mbeukman
  • License: Apache License 2.0
  • Task: Named Entity Recognition (NER)
  • Base Model: xlm-roberta-base
  • F1 Score: 78.22%

What is xlm-roberta-base-finetuned-ner-yoruba?

This is a specialized Named Entity Recognition (NER) model fine-tuned on the MasakhaNER dataset, specifically for the Yoruba language. Built on the XLM-RoBERTa base model, it identifies and classifies named entities in Yoruba text: persons, organizations, locations, and dates. The model was trained on an NVIDIA RTX 3090 GPU and reports per-category F1 scores for each entity type (detailed under Core Capabilities below).
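A minimal usage sketch, assuming the model is published on the Hugging Face Hub under the id mbeukman/xlm-roberta-base-finetuned-ner-yoruba (following the author/model naming in this card) and loads through the standard transformers token-classification pipeline; the example sentence is purely illustrative:

```python
from transformers import pipeline

# Hub id assumed from the author/model naming in this card.
model_id = "mbeukman/xlm-roberta-base-finetuned-ner-yoruba"

# aggregation_strategy="simple" merges B-/I- subword tags into whole entity spans.
ner = pipeline("token-classification", model=model_id, aggregation_strategy="simple")

# Illustrative Yoruba sentence ("Adebayo lives in Lagos, Nigeria.");
# output is a list of dicts with entity_group, score, word, start, end.
print(ner("Adébáyọ̀ ń gbé ní Èkó, Nàìjíríà."))
```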

Implementation Details

The model was fine-tuned for 50 epochs with a maximum sequence length of 200, a batch size of 32, and a learning rate of 5e-5. Training was repeated across 5 different random seeds to ensure robustness, with this version representing the best-performing run; a configuration sketch follows the list below.

  • Training Time: 10-30 minutes per iteration
  • GPU Memory Required: 14GB (optimal), 6.5GB (minimum with batch size 1)
  • Architecture: XLM-RoBERTa with token classification head
  • Dataset: MasakhaNER Yoruba subset
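A sketch of how those hyperparameters might be expressed with the transformers Trainer API. The actual training script is not part of this card, so the output directory name and seed value here are assumptions; the label count matches the nine tags listed under Core Capabilities:

```python
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    TrainingArguments,
)

BASE_MODEL = "xlm-roberta-base"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
# Nine labels: O plus B-/I- tags for PER, ORG, LOC, DATE.
model = AutoModelForTokenClassification.from_pretrained(BASE_MODEL, num_labels=9)

args = TrainingArguments(
    output_dir="xlm-roberta-base-finetuned-ner-yoruba",  # assumed name
    num_train_epochs=50,             # 50 epochs, as reported above
    per_device_train_batch_size=32,  # batch size 32
    learning_rate=5e-5,              # learning rate 5e-5
    seed=1,                          # one of the 5 seeds; best run kept
)

# The reported maximum sequence length of 200 would apply at tokenization
# time, e.g. tokenizer(..., truncation=True, max_length=200).
```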

Core Capabilities

  • Entity Detection: Persons (F1: 82%), Locations (F1: 80%), Organizations (F1: 71%), Dates (F1: 77%)
  • Token Classification: Support for 9 label types with B-/I- prefixes (enumerated in the sketch after this list)
  • Multilingual Foundation: Built on XLM-RoBERTa's multilingual capabilities
  • African Language Support: Specialized for Yoruba text processing
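The nine label types follow the MasakhaNER IOB2 scheme: an O tag plus begin/inside pairs for the four entity categories scored above, which accounts for the count of nine. A minimal sketch:

```python
# The nine label types referenced above: O plus B-/I- (begin/inside)
# pairs for the four scored entity categories.
LABELS = [
    "O",
    "B-PER", "I-PER",    # persons
    "B-ORG", "I-ORG",    # organizations
    "B-LOC", "I-LOC",    # locations
    "B-DATE", "I-DATE",  # dates
]
```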

Frequently Asked Questions

Q: What makes this model unique?

This model is part of the first large-scale effort to create high-quality NER models for African languages, specifically optimized for Yoruba. It demonstrates strong performance while addressing the critical need for NLP tools in under-resourced languages.

Q: What are the recommended use cases?

The model is primarily intended for NLP research purposes, including interpretability studies and transfer learning experiments. It's not recommended for production use due to potential limitations in generalizability and performance across different domains.
