mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
Property | Value |
---|---|
Parameter Count | 279M |
License | MIT |
Languages Supported | 100 (pre-training), 27 (NLI fine-tuning) |
Fine-tuning Data | 2.7M+ hypothesis-premise pairs |
What is mDeBERTa-v3-base-xnli-multilingual-nli-2mil7?
mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 is a multilingual natural language inference (NLI) model built on Microsoft's mDeBERTa-v3-base architecture. The base model was pre-trained on the CC100 multilingual dataset, which covers 100 languages, and then fine-tuned on more than 2.7 million hypothesis-premise pairs in 27 languages. It can be used for NLI and for zero-shot classification in multiple languages.
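To make the link between NLI and zero-shot classification concrete, here is a minimal sketch of the general idea, not the model's actual code. Each candidate label is turned into a hypothesis, an entailment score is computed for each (text, hypothesis) pair, and a softmax over those scores ranks the labels. The `toy_entailment_logit` scorer, the hypothesis template, and the example sentence are hypothetical stand-ins; in practice the entailment logit would come from the NLI model.

```python
import math
import re
from typing import Callable, Dict, List


def toy_entailment_logit(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for the NLI model: counts shared words."""
    premise_words = set(re.findall(r"\w+", premise.lower()))
    hypothesis_words = set(re.findall(r"\w+", hypothesis.lower()))
    return float(len(premise_words & hypothesis_words))


def zero_shot_scores(
    text: str,
    labels: List[str],
    entailment_logit: Callable[[str, str], float],
    template: str = "This example is about {}.",
) -> Dict[str, float]:
    """Rank candidate labels by the entailment score of their hypotheses."""
    # One hypothesis per label, e.g. "This example is about politics."
    logits = [entailment_logit(text, template.format(label)) for label in labels]
    # Numerically stable softmax over the entailment logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


scores = zero_shot_scores(
    "The election results surprised politics experts",
    ["politics", "sports"],
    toy_entailment_logit,
)
print(max(scores, key=scores.get))  # prints "politics" with the toy scorer
```

Because the scorer is passed in as a function, the same ranking logic applies unchanged once a real NLI model supplies the entailment logits.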
Implementation Details
The model uses the mDeBERTa-v3-base architecture and was fine-tuned on a combination of two datasets, XNLI and multilingual-NLI-26lang-2mil7. Its reported accuracy is 87.1% on English MNLI, and accuracy typically stays above 80% across the languages evaluated.
- Pre-trained on CC100 dataset covering 100 languages
- Fine-tuned on 2.7M+ hypothesis-premise pairs
- Supports zero-shot classification across languages
- Optimized for both mono- and cross-lingual tasks
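For plain NLI, a premise and a hypothesis are encoded together and the model returns one logit per class. The sketch below assumes the Hugging Face `transformers` and `torch` packages and the Hub ID `MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`, none of which is stated above; the function names are illustrative. Class names are read from the model config rather than hard-coded, since the label order is not given here.

```python
import math
from typing import Dict, Mapping, Sequence

# Assumed Hugging Face Hub ID; adjust if the checkpoint lives elsewhere.
MODEL_ID = "MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7"


def label_probabilities(
    logits: Sequence[float], id2label: Mapping[int, str]
) -> Dict[str, float]:
    """Convert raw class logits into a {label: probability} mapping."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # stable softmax
    total = sum(exps)
    return {id2label[i]: e / total for i, e in enumerate(exps)}


def predict_nli(premise: str, hypothesis: str, model_id: str = MODEL_ID) -> Dict[str, float]:
    """Score one premise-hypothesis pair (downloads weights on first use)."""
    # Heavy dependencies are imported lazily so the helper above stays usable
    # without them. The tokenizer may additionally need `sentencepiece`.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    # Read the class names from the checkpoint instead of assuming an order.
    return label_probabilities(logits, model.config.id2label)
```

The premise and hypothesis may be in different languages, which is how the cross-lingual case would be exercised.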
Core Capabilities
- Natural Language Inference across 100 languages
- Zero-shot classification in multiple languages
- Cross-lingual transfer learning
- High performance on standard NLI benchmarks
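The zero-shot capability listed above can be tried through the `zero-shot-classification` pipeline in Hugging Face `transformers`. This is a hedged sketch: the Hub ID, the helper names `load_classifier` and `classify`, and the example sentence and labels are assumptions for illustration, not part of the description above.

```python
from typing import Any, Callable, Dict, List

# Assumed Hugging Face Hub ID; adjust if the checkpoint lives elsewhere.
MODEL_ID = "MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7"


def load_classifier(model_id: str = MODEL_ID) -> Any:
    """Build a zero-shot pipeline; downloads the weights on first use."""
    from transformers import pipeline  # imported lazily: heavy dependency

    return pipeline("zero-shot-classification", model=model_id)


def classify(
    classifier: Callable[..., Dict[str, Any]],
    text: str,
    labels: List[str],
    multi_label: bool = False,
) -> Dict[str, float]:
    """Return {label: score} for one text, highest-scoring label first."""
    result = classifier(text, labels, multi_label=multi_label)
    # The pipeline returns parallel "labels" and "scores" lists sorted by score.
    return dict(zip(result["labels"], result["scores"]))


# Usage (requires network access and the `transformers` package):
# classifier = load_classifier()
# classify(
#     classifier,
#     "Angela Merkel ist eine Politikerin in Deutschland",
#     ["politics", "economy", "entertainment", "environment"],
# )
```

In the usage comment the text is German while the labels are English, which illustrates the cross-lingual setting. Setting `multi_label=True` scores each label independently instead of normalizing across labels.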
Frequently Asked Questions
Q: What makes this model unique?
The model can perform NLI across 100 languages, and its fine-tuning on 2.7M+ hypothesis-premise pairs makes it robust for multilingual applications. It has also been explicitly tested for cross-lingual transfer.
Q: What are the recommended use cases?
The model is suited to multilingual zero-shot classification, natural language inference, and cross-lingual applications that compare the meaning of texts in different languages. It fits academic and research work that requires multilingual text analysis.