afro-xlmr-large
Property | Value |
---|---|
Author | Davlan |
Model Type | Multilingual Language Model |
Paper | COLING 2022 |
HuggingFace | Model Repository |
What is afro-xlmr-large?
afro-xlmr-large is a multilingual language model adapted for African languages. It was created by multilingual adaptive fine-tuning (MAFT) of the XLM-R-large model on 17 African languages plus 3 high-resource languages widely used in Africa (Arabic, French, and English). The model improves on baseline models, most notably on Named Entity Recognition (NER) tasks.
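Because the model is an adaptation of XLM-R, it can be probed with a masked-token query. The following is a minimal sketch, assuming the checkpoint is published on the Hugging Face Hub as `Davlan/afro-xlmr-large` and that the `transformers` library is installed. The helper names `mask_word` and `top_predictions` are hypothetical and exist only for this illustration.

```python
# Assumption: Hub identifier of the checkpoint described on this page.
MODEL_ID = "Davlan/afro-xlmr-large"
# XLM-R tokenizers use "<mask>" as the mask token.
MASK_TOKEN = "<mask>"


def mask_word(sentence, index):
    """Replace the whitespace-separated word at `index` with the mask token."""
    words = sentence.split()
    words[index] = MASK_TOKEN
    return " ".join(words)


def top_predictions(masked_sentence, k=3):
    """Return the k most likely fillers as (token, score) pairs.

    The import is done lazily because loading the pipeline downloads
    the model weights on first use.
    """
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=MODEL_ID)
    results = unmasker(masked_sentence, top_k=k)
    return [(r["token_str"], r["score"]) for r in results]


masked = mask_word("Nairobi is the capital of Kenya", 3)
print(masked)
```

Calling `top_predictions(masked)` would then query the model. It is left out of the script above because it triggers a large download.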
Implementation Details
The model was developed by combining vocabulary reduction with multilingual adaptive fine-tuning. Tokens belonging to non-African writing scripts are removed from the embedding layer, which reduces model size by approximately 50% while maintaining performance.
- Supports 17 African languages including Afrikaans, Amharic, Hausa, Igbo, and more
- Includes support for Arabic, French, and English
- Achieves 83.9% average F-score on MasakhaNER evaluation
- Optimized vocabulary focusing on African language scripts
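The vocabulary reduction idea can be illustrated on a toy vocabulary. This is a sketch of the general technique, not the authors' procedure: the set of kept scripts, the script heuristic (first word of the Unicode character name), and the names `script_of`, `keep_token`, and `prune_vocab` are all assumptions made for the example.

```python
import unicodedata

# Assumption: scripts kept in this illustration only.
KEPT_SCRIPTS = {"LATIN", "ETHIOPIC", "ARABIC"}


def script_of(ch):
    """Heuristic script lookup: first word of the Unicode character name."""
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:  # code point without a name
        return "UNKNOWN"


def keep_token(token):
    """Keep a token if every alphabetic character belongs to a kept script.

    Digits, punctuation, and the SentencePiece marker are not alphabetic,
    so tokens made only of those are always kept.
    """
    return all(script_of(ch) in KEPT_SCRIPTS for ch in token if ch.isalpha())


def prune_vocab(vocab):
    """Return (new_vocab, kept_old_ids).

    `kept_old_ids` lists the original ids in order; it is what one would
    use to select the surviving rows of the embedding matrix.
    """
    ordered = sorted(vocab.items(), key=lambda item: item[1])
    kept = [(tok, old_id) for tok, old_id in ordered if keep_token(tok)]
    new_vocab = {tok: new_id for new_id, (tok, _) in enumerate(kept)}
    kept_old_ids = [old_id for _, old_id in kept]
    return new_vocab, kept_old_ids


toy_vocab = {
    "<s>": 0,
    "▁the": 1,
    "▁ሰላም": 2,    # Ethiopic
    "▁中文": 3,     # CJK, dropped
    "▁привет": 4,  # Cyrillic, dropped
    "▁سلام": 5,    # Arabic
    "123": 6,
}
new_vocab, kept_old_ids = prune_vocab(toy_vocab)
print(new_vocab)
print(kept_old_ids)
```

In a real model the embedding matrix would be indexed with `kept_old_ids` so that each surviving token keeps its trained vector under its new id.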
Core Capabilities
- Named Entity Recognition with state-of-the-art performance
- Cross-lingual transfer learning
- Efficient multilingual processing
- Reduced model size while maintaining performance
- Zero-shot cross-lingual transfer capabilities
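NER quality is normally reported as an entity-level F-score: a predicted entity counts only if its boundaries and type both match the gold annotation. The sketch below shows one way to compute it from BIO tag sequences. It assumes BIO tagging and micro-averaging over sentences, and the function names are hypothetical. It is not the evaluation script behind the reported MasakhaNER figure.

```python
def extract_spans(tags):
    """Turn a BIO tag sequence into a set of (start, end, type) spans."""
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        # "B-" always opens a span; a stray "I-" is treated leniently as a start.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return spans


def span_f1(gold_seqs, pred_seqs):
    """Micro-averaged entity-level F1 over parallel lists of tag sequences."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
score = span_f1(gold, pred)
print(round(score, 3))
```

Here the prediction finds one of the two gold entities with no false positives, so precision is 1.0, recall is 0.5, and the F-score is 2/3.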
Frequently Asked Questions
Q: What makes this model unique?
Two things set the model apart: its focus on African languages and its multilingual adaptation approach. Removing non-African script tokens and applying MAFT yields a single compact model that performs strongly across languages, instead of requiring a separate adapted model for each language.
Q: What are the recommended use cases?
The model is particularly well-suited for NLP tasks in African languages, including Named Entity Recognition, news topic classification, and sentiment analysis. It's especially valuable for applications requiring cross-lingual transfer learning between African languages.
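For a task such as NER, the pretrained encoder is typically given a token-classification head and then fine-tuned on labeled data. The sketch below shows only the setup step. It assumes the Hub identifier `Davlan/afro-xlmr-large`, an installed `transformers` library, and a BIO label set over four entity types chosen for illustration. The helpers `build_label_maps` and `load_for_ner` are hypothetical.

```python
# Assumption: entity types used for this illustration.
ENTITY_TYPES = ["PER", "ORG", "LOC", "DATE"]


def build_label_maps(entity_types):
    """Build BIO label <-> id mappings: "O" first, then B-/I- per type."""
    labels = ["O"]
    for etype in entity_types:
        labels += [f"B-{etype}", f"I-{etype}"]
    id2label = dict(enumerate(labels))
    label2id = {label: idx for idx, label in id2label.items()}
    return id2label, label2id


def load_for_ner(model_id="Davlan/afro-xlmr-large"):
    """Load tokenizer and encoder with a freshly initialized NER head.

    The import is lazy because loading downloads the model weights.
    The classification head is untrained until the model is fine-tuned.
    """
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    id2label, label2id = build_label_maps(ENTITY_TYPES)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(
        model_id,
        num_labels=len(id2label),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model


id2label, label2id = build_label_maps(ENTITY_TYPES)
print(id2label)
```

Topic classification or sentiment analysis would follow the same pattern with `AutoModelForSequenceClassification` and a sentence-level label set.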