bert-base-multilingual-cased-finetuned-yoruba

Maintained By
Davlan

Property | Value
Author | Davlan
Base Model | bert-base-multilingual-cased
Language | Yoruba
Training Hardware | NVIDIA V100 GPU

What is bert-base-multilingual-cased-finetuned-yoruba?

This is a BERT model fine-tuned for the Yoruba language, starting from bert-base-multilingual-cased. On Yoruba named entity recognition and text classification benchmarks, it scores higher than the standard multilingual BERT model (mBERT).

Implementation Details

The model was trained on Bible texts, JW300, Menyo-20k, the Yoruba Embedding corpus, CC-Aligned, Wikipedia, and news sources (BBC Yoruba, VON Yoruba, Asejere, and Alaroye). Training ran on a single NVIDIA V100 GPU.

  • Achieves 82.58% F1 score on MasakhaNER (improvement over mBERT's 78.97%)
  • Achieves 79.11% F1 score on BBC Yorùbá Text Classification (improvement over mBERT's 75.13%)
  • Supports masked token prediction through the Transformers pipeline
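
Masked token prediction through the Transformers pipeline might look like the sketch below. It assumes the model is published on the Hugging Face Hub under the ID Davlan/bert-base-multilingual-cased-finetuned-yoruba (inferred from the author and model name on this page) and that the transformers library is installed. The helper names format_predictions and predict_mask are hypothetical, introduced only for illustration.

```python
# Assumed Hub ID, inferred from the author and model name on this page.
MODEL_ID = "Davlan/bert-base-multilingual-cased-finetuned-yoruba"


def format_predictions(results, top_k=3):
    """Reduce fill-mask pipeline output to (token, score) pairs."""
    return [(r["token_str"], round(r["score"], 4)) for r in results[:top_k]]


def predict_mask(text, top_k=3):
    """Return the top_k candidate tokens for the single [MASK] in `text`."""
    # Imported lazily so format_predictions works without transformers.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=MODEL_ID)
    return format_predictions(unmasker(text, top_k=top_k), top_k)


# Example call (downloads the model weights on first use). The sentence is
# only an illustrative input; any Yoruba text with one [MASK] token works.
# predict_mask("Arẹmọ Phillip to jẹ ọkọ [MASK] Elizabeth")
```

The mask token is [MASK] because the base model uses the BERT tokenizer; the scores returned are the model's probabilities for each candidate token.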

Core Capabilities

  • Named Entity Recognition in Yoruba text
  • Text Classification tasks
  • Masked Language Modeling
  • Context-aware token prediction
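
For Named Entity Recognition, the checkpoint would typically be loaded with a token classification head and then fine-tuned on labeled data. The sketch below is one way to set that up, not the author's training code. It assumes the Hub ID Davlan/bert-base-multilingual-cased-finetuned-yoruba and a BIO tag set over the entity types PER, ORG, LOC, and DATE (an assumption about the MasakhaNER label scheme); build_label_maps and load_ner_model are hypothetical helper names.

```python
# Assumed Hub ID, inferred from the author and model name on this page.
MODEL_ID = "Davlan/bert-base-multilingual-cased-finetuned-yoruba"

# Assumed entity types; adjust to match the dataset being used.
ENTITY_TYPES = ["PER", "ORG", "LOC", "DATE"]


def build_label_maps(entity_types):
    """Build BIO label <-> id mappings, with "O" at index 0."""
    labels = ["O"]
    for entity in entity_types:
        labels += [f"B-{entity}", f"I-{entity}"]
    id2label = dict(enumerate(labels))
    label2id = {label: idx for idx, label in id2label.items()}
    return id2label, label2id


def load_ner_model():
    """Load the tokenizer and the encoder with an untrained NER head."""
    # Imported lazily so build_label_maps works without transformers.
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    id2label, label2id = build_label_maps(ENTITY_TYPES)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForTokenClassification.from_pretrained(
        MODEL_ID,
        num_labels=len(id2label),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

The classification head created this way is randomly initialized, so the model needs fine-tuning on annotated Yoruba data before its NER predictions are meaningful.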

Frequently Asked Questions

Q: What makes this model unique?

This model is fine-tuned specifically on Yoruba text, and it outperforms the general multilingual BERT model on the Yoruba benchmarks reported above. Its training data spans religious texts, Wikipedia, web-crawled text, and news sources.

Q: What are the recommended use cases?

The model is ideal for Named Entity Recognition, text classification, and general Yoruba language understanding tasks. It's particularly suitable for processing news content, religious texts, and general Yoruba language documents.
