mDeBERTa-v3-base-finetuned-nli-jnli
Property | Value |
---|---|
License | MIT |
Base Model | microsoft/mdeberta-v3-base |
Primary Language | Japanese |
Downloads | 47,852 |
Performance (F1) | 67.42% |
What is mDeBERTa-v3-base-finetuned-nli-jnli?
This is a Japanese-focused model fine-tuned for Natural Language Inference (NLI) and zero-shot classification. It builds on Microsoft's mDeBERTa-v3-base checkpoint and was fine-tuned on the JGLUE dataset together with multilingual NLI data spanning 26 languages.
Implementation Details
The model was fine-tuned with a learning rate of 3e-05 using the Adam optimizer (betas=(0.9, 0.999)). Training ran for 2 epochs with a linear learning rate scheduler and a 6% warmup ratio. The final validation accuracy was 68.08%, with an F1 score of 67.42%.
- Batch size: 8 for both training and evaluation
- Optimizer: Adam with epsilon=1e-08
- Training framework: PyTorch 2.0.1 with Transformers 4.33.2
- Target tasks: zero-shot classification and NLI
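To make the scheduler settings concrete, here is a minimal pure-Python sketch of a linear schedule with warmup, using the learning rate (3e-05) and warmup ratio (6%) listed above. The function name `linear_warmup_lr` and the step counts are hypothetical. Rounding the warmup length up with `math.ceil` is also an assumption, since the card does not say how the warmup step count was derived.

```python
import math


def linear_warmup_lr(step, total_steps, base_lr=3e-5, warmup_ratio=0.06):
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    # Assumption: the warmup length is the ratio of total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 30, 60, 500, 1000):
        print(s, linear_warmup_lr(s, total))
```

The real number of optimizer steps depends on the dataset size, the batch size of 8, and the 2 epochs, none of which can be derived from the card alone.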
Core Capabilities
- Zero-shot text classification for Japanese content
- Natural Language Inference tasks
- Multilingual support, with fine-tuning focused on Japanese
- Flexible classification with customizable label sets
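The zero-shot capability listed above can be tried with the Transformers `zero-shot-classification` pipeline. The following is a sketch under stated assumptions, not an official usage recipe. The Hub ID in `MODEL_ID` is assumed, because the card gives only the model name without a namespace. The helper names `load_classifier` and `top_label`, the sample sentences, and the labels are illustrative.

```python
# Assumed Hub ID; the card above does not state the repository namespace.
MODEL_ID = "thkkvui/mDeBERTa-v3-base-finetuned-nli-jnli"


def load_classifier(model_id=MODEL_ID):
    """Build a zero-shot classification pipeline (downloads the model)."""
    # Imported lazily so top_label() can be used without transformers installed.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model=model_id)


def top_label(classifier, text, labels):
    """Return the (label, score) pair with the highest score for `text`."""
    result = classifier(text, labels, multi_label=False)
    # Pick the maximum explicitly instead of relying on the output order.
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])


if __name__ == "__main__":
    clf = load_classifier()
    labels = ["天気", "ニュース", "金融", "予定"]  # weather, news, finance, schedule
    for text in ["今日の天気を教えて", "ドル円は？"]:
        print(text, top_label(clf, text, labels))
```

Passing the classifier in as an argument keeps `top_label` independent of the model download, so the label set can be swapped or the helper exercised with a stub.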
Frequently Asked Questions
Q: What makes this model unique?
This model combines a Japanese focus with the multilingual coverage of its base model. It was fine-tuned on both JGLUE and multilingual NLI data, so it targets Japanese NLI and zero-shot classification while keeping multilingual support.
Q: What are the recommended use cases?
The model is intended for zero-shot classification of Japanese text, such as intent classification and topic categorization, and for natural language inference. It suits applications that need classification without extensive labeled training data; the model's example use cases cover weather, news, finance, and schedule-related queries.
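For plain NLI, as opposed to zero-shot classification, a premise and a hypothesis can be scored directly. This is a minimal sketch assuming the checkpoint loads with `AutoModelForSequenceClassification` and exposes its class names through `config.id2label`. The helper names are hypothetical, the model ID is left as a parameter because the card does not give the full repository name, and the label order should be read from the model config rather than assumed.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_probabilities(logits, id2label):
    """Map raw logits to {label name: probability} using the model's id2label."""
    probs = softmax(logits)
    return {id2label[i]: p for i, p in enumerate(probs)}


def predict_nli(premise, hypothesis, model_id):
    """Score one premise/hypothesis pair (downloads the model on first use)."""
    # Imported lazily so the helpers above work without torch or transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_probabilities(logits, model.config.id2label)
```

Zero-shot classification reuses this same scoring: each candidate label is turned into a hypothesis, and the entailment probability decides which label wins.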