xlm-roberta-base-finetuned-ner-yoruba

mbeukman

NER model fine-tuned on Yoruba language data, achieving 78.22% F1 score. Based on XLM-RoBERTa, specialized for African language NER tasks.

  • Author: mbeukman
  • License: Apache License 2.0
  • Task: Named Entity Recognition (NER)
  • Base Model: xlm-roberta-base
  • F1 Score: 78.22%

What is xlm-roberta-base-finetuned-ner-yoruba?

This is a specialized Named Entity Recognition (NER) model fine-tuned on the Yoruba subset of the MasakhaNER dataset. Built on the XLM-RoBERTa base model, it identifies and classifies named entities in Yoruba text: persons, organizations, locations, and dates. The model was trained on an NVIDIA RTX 3090 GPU and reaches an overall F1 score of 78.22%, with per-category results reported below.
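The model can be loaded with the standard Hugging Face token-classification pipeline. A minimal sketch follows; the repository ID `mbeukman/xlm-roberta-base-finetuned-ner-yoruba` is assumed from the author and model names above, and the sample sentence is only illustrative.

```python
# Assumed Hugging Face repo ID (author + model name from this card).
MODEL_ID = "mbeukman/xlm-roberta-base-finetuned-ner-yoruba"

def load_ner_pipeline():
    """Build a token-classification pipeline for the Yoruba NER model.

    transformers is imported lazily so this sketch only requires the
    library (and a network connection) when actually called.
    """
    from transformers import (AutoModelForTokenClassification,
                              AutoTokenizer, pipeline)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForTokenClassification.from_pretrained(MODEL_ID)
    # aggregation_strategy="simple" merges B-/I- subword tags into entity spans
    return pipeline("ner", model=model, tokenizer=tokenizer,
                    aggregation_strategy="simple")

if __name__ == "__main__":
    ner = load_ner_pipeline()
    print(ner("Adé lọ sí Èkó"))  # illustrative Yoruba sentence
```

Each returned entry contains the entity group (PER, ORG, LOC, or DATE), the matched text span, and a confidence score.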

Implementation Details

The model was fine-tuned for 50 epochs with a maximum sequence length of 200, a batch size of 32, and a learning rate of 5e-5. Training was repeated across 5 random seeds to check robustness, and this release is the best-performing run.

  • Training Time: 10-30 minutes per iteration
  • GPU Memory Required: 14GB (optimal), 6.5GB (minimum with batch size 1)
  • Architecture: XLM-RoBERTa with token classification head
  • Dataset: MasakhaNER Yoruba subset
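The training configuration above can be sketched with the transformers `TrainingArguments` API. This is a hypothetical reconstruction from the hyperparameters reported in this card, not the author's actual training script; dataset loading and tokenization are omitted.

```python
# Hyperparameters as reported in this model card.
HYPERPARAMS = {
    "max_seq_length": 200,
    "per_device_train_batch_size": 32,
    "learning_rate": 5e-5,
    "num_train_epochs": 50,
}

def make_training_args(output_dir="xlmr-ner-yoruba", seed=0):
    """Build TrainingArguments matching the card's reported setup.

    The card reports 5 runs with different random seeds; pass a
    different `seed` per run. transformers is imported lazily so the
    sketch itself has no import-time dependency.
    """
    from transformers import TrainingArguments
    return TrainingArguments(
        output_dir=output_dir,
        seed=seed,
        per_device_train_batch_size=HYPERPARAMS["per_device_train_batch_size"],
        learning_rate=HYPERPARAMS["learning_rate"],
        num_train_epochs=HYPERPARAMS["num_train_epochs"],
    )
```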

Core Capabilities

  • Entity Detection: Persons (F1: 82%), Locations (F1: 80%), Organizations (F1: 71%), Dates (F1: 77%)
  • Token Classification: 9 labels in the BIO scheme — O plus B-/I- tags for the PER, ORG, LOC, and DATE categories
  • Multilingual Foundation: Built on XLM-RoBERTa's multilingual capabilities
  • African Language Support: Specialized for Yoruba text processing
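The 9-label BIO scheme works as follows: a B- tag opens an entity, consecutive I- tags of the same type extend it, and O marks non-entity tokens. A small stdlib-only sketch of decoding these tags into entity spans (the helper name `bio_to_spans` is illustrative, not part of the model's API):

```python
# The 9 MasakhaNER tags: O plus B-/I- for DATE, LOC, ORG, PER.
LABELS = ["O", "B-DATE", "I-DATE", "B-LOC", "I-LOC",
          "B-ORG", "I-ORG", "B-PER", "I-PER"]

def bio_to_spans(tokens, tags):
    """Merge per-token BIO tags into (entity_type, text) spans."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # B- always starts a new entity, closing any open one.
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            # I- extends an open entity of the same type.
            current[1].append(token)
        else:
            # O (or a stray I-) closes any open entity.
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(etype, " ".join(words)) for etype, words in spans]
```

For example, `bio_to_spans(["Adé", "lọ", "sí", "Èkó"], ["B-PER", "O", "O", "B-LOC"])` yields `[("PER", "Adé"), ("LOC", "Èkó")]`.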

Frequently Asked Questions

Q: What makes this model unique?

This model is part of the first large-scale effort to create high-quality NER models for African languages, specifically optimized for Yoruba. It demonstrates strong performance while addressing the critical need for NLP tools in under-resourced languages.

Q: What are the recommended use cases?

The model is primarily intended for NLP research purposes, including interpretability studies and transfer learning experiments. It's not recommended for production use due to potential limitations in generalizability and performance across different domains.
