afro-xlmr-small
Property | Value |
---|---|
Author | Davlan |
Model Type | Multilingual Language Model |
Vocabulary Size | 70,000 tokens |
Paper | COLING 2022 |
Model URL | Hugging Face |
What is afro-xlmr-small?
afro-xlmr-small is a multilingual language model adapted for African languages. It was created by reducing XLM-R-base's vocabulary from 250K to 70K tokens and then performing multilingual adaptive fine-tuning (MAFT) on 17 African languages plus Arabic, French, and English. It performs competitively on several NLP tasks while having a smaller footprint than XLM-R-base.
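To see why shrinking the vocabulary matters so much for model size, a rough parameter count can help. The sketch below is back-of-the-envelope arithmetic, not an official figure: the hidden size of 768 and the total of about 278M parameters for XLM-R-base are assumptions brought in from outside this page, and the vocabulary sizes are rounded to 250K and 70K.

```python
# Rough parameter arithmetic for shrinking the token-embedding matrix.
# Assumed (not stated on this page): hidden size 768 and ~278M total
# parameters for XLM-R-base with its ~250K-token vocabulary.
HIDDEN_SIZE = 768
BASE_VOCAB = 250_000
SMALL_VOCAB = 70_000
BASE_TOTAL_PARAMS = 278_000_000  # approximate


def embedding_params(vocab_size: int, hidden_size: int = HIDDEN_SIZE) -> int:
    """Number of weights in a vocab_size x hidden_size embedding table."""
    return vocab_size * hidden_size


def estimate_small_model_params() -> int:
    """Keep every non-embedding weight, swap in the smaller embedding table."""
    non_embedding = BASE_TOTAL_PARAMS - embedding_params(BASE_VOCAB)
    return non_embedding + embedding_params(SMALL_VOCAB)


small_total = estimate_small_model_params()
reduction = 1 - small_total / BASE_TOTAL_PARAMS
print(f"estimated parameters: {small_total / 1e6:.0f}M")
print(f"estimated size reduction: {reduction:.0%}")
```

Under these assumptions the embedding table accounts for most of the base model's weights, so removing 180K rows from it cuts the total roughly in half while leaving the transformer layers untouched.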
Implementation Details
The model focuses specifically on African languages. Vocabulary tokens corresponding to non-African writing scripts are removed, which reduces model size by about 50%, and the model is then adapted with multilingual adaptive fine-tuning (MAFT). Performance remains competitive after the reduction.
- Supports 17 African languages, including Afrikaans, Amharic, Hausa, and Igbo
- Reduced vocabulary size of 70K tokens for efficiency
- Optimized for cross-lingual transfer learning
- Shows competitive performance on NER tasks compared to larger models
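The script-based vocabulary reduction described above can be illustrated with a toy example. This is a minimal sketch of the general idea only: the allow-list `ALLOWED_SCRIPTS`, the helper names, and the use of Unicode character names to detect scripts are all hypothetical choices made for illustration, and the authors' actual filtering criteria may differ.

```python
import unicodedata

# Hypothetical allow-list of script prefixes as they appear in Unicode
# character names; the authors' actual criteria are not given here.
ALLOWED_SCRIPTS = ("LATIN", "ETHIOPIC", "ARABIC")


def uses_allowed_script(token: str) -> bool:
    """True if every alphabetic character in the token is in an allowed script."""
    for ch in token:
        if not ch.isalpha():
            continue  # digits, punctuation, and the "▁" word marker pass through
        name = unicodedata.name(ch, "")
        if not name.startswith(ALLOWED_SCRIPTS):
            return False
    return True


def prune_vocab(vocab, embeddings):
    """Drop disallowed tokens and keep the matching embedding rows.

    vocab: list of token strings, where list index == token id
    embeddings: list of row vectors aligned with vocab
    Returns (new_vocab, new_embeddings, old_id -> new_id map).
    """
    new_vocab, new_embeddings, id_map = [], [], {}
    for old_id, token in enumerate(vocab):
        if uses_allowed_script(token):
            id_map[old_id] = len(new_vocab)
            new_vocab.append(token)
            new_embeddings.append(embeddings[old_id])
    return new_vocab, new_embeddings, id_map


# Toy example: 5 tokens with 2-dimensional embeddings.
toy_vocab = ["▁gida", "▁ቤት", "▁家", "▁дом", "2022"]
toy_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
kept_vocab, kept_embeddings, id_map = prune_vocab(toy_vocab, toy_embeddings)
print(kept_vocab)
```

Because the surviving embedding rows are copied rather than retrained, the pruned model starts from the original representations for every kept token, and the adaptive fine-tuning step then adjusts them on the target languages.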
Core Capabilities
- Named Entity Recognition (NER) with strong performance across multiple African languages
- News topic classification
- Sentiment classification
- Zero-shot cross-lingual transfer
- Parameter efficient fine-tuning
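For tasks such as NER, the model would typically be loaded with a task-specific head and then fine-tuned. The sketch below assumes the checkpoint is published on the Hugging Face Hub as `Davlan/afro-xlmr-small` and that the `transformers` library is installed; the tag set `NER_LABELS` and the helper `load_for_ner` are hypothetical and should be replaced to match your dataset.

```python
MODEL_ID = "Davlan/afro-xlmr-small"  # assumed Hub identifier

# Assumed tag set (PER/ORG/LOC/DATE in BIO format); replace with your own.
NER_LABELS = [
    "O", "B-PER", "I-PER", "B-ORG", "I-ORG",
    "B-LOC", "I-LOC", "B-DATE", "I-DATE",
]


def load_for_ner(model_id: str = MODEL_ID, labels=NER_LABELS):
    """Load the tokenizer and encoder with a fresh token-classification head."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    id2label = dict(enumerate(labels))
    label2id = {label: i for i, label in id2label.items()}
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(
        model_id,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

Calling `tokenizer, model = load_for_ner()` downloads the weights on first use. The classification head is randomly initialized, so the model needs fine-tuning on labeled data before its predictions are meaningful.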
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its focus on African languages and its reduced size. By shrinking the vocabulary and removing tokens for non-African scripts, it achieves performance comparable to larger models while using fewer resources. Its NER results are particularly strong for Hausa (91.4 F1) and Nigerian Pidgin (89.0 F1).
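For readers unfamiliar with how NER F1 scores like these are usually computed, here is a minimal sketch of span-level F1 over BIO tags. It is an illustration of the common convention, not the evaluation script behind the reported numbers; the function names are hypothetical, and the result is on a 0 to 1 scale rather than 0 to 100.

```python
def extract_spans(tags):
    """Turn a BIO tag sequence into a set of (type, start, end) spans."""
    spans, start, kind = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not inside:
            spans.add((kind, start, i))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return spans


def entity_f1(gold, pred):
    """Span-level F1: a prediction counts only if type and boundaries both match."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
print(f"F1 = {entity_f1(gold, pred):.3f}")
```

In this toy sentence the prediction finds one of the two gold entities with no false positives, so precision is 1.0, recall is 0.5, and F1 is their harmonic mean.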
Q: What are the recommended use cases?
The model is well suited to NLP tasks involving African languages, especially NER, topic classification, and sentiment analysis. It is a good fit for applications that need to process multiple African languages with limited computational resources.