DeBERTa-v3-base-tasksource-nli
| Property | Value |
|---|---|
| Parameter Count | 184M |
| License | Apache 2.0 |
| Paper | arXiv:2301.05948 |
| Architecture | DeBERTa-v3-base |
What is deberta-v3-base-tasksource-nli?
deberta-v3-base-tasksource-nli is a language model based on the DeBERTa-v3-base architecture and fine-tuned with multi-task learning on more than 600 NLP tasks. It targets zero-shot classification and natural language inference (NLI), reaching, for example, 70% accuracy on WNLI.
Implementation Details
The model was trained for 200,000 steps with a batch size of 384 and a peak learning rate of 2e-5. Training took 15 days on an Nvidia A30 24GB GPU. Each task uses a task-specific CLS embedding, which is dropped 10% of the time during training so that the model can also be used without it.
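As a rough sanity check on the figures above, the training budget can be worked out directly. This is only back-of-the-envelope arithmetic: it assumes the batch size of 384 is the number of examples consumed per step and that training ran continuously for the full 15 days, neither of which is stated explicitly.

```python
# Figures taken from the paragraph above.
STEPS = 200_000
BATCH_SIZE = 384
TRAINING_DAYS = 15

# Assumption: every step consumes exactly BATCH_SIZE examples.
examples_seen = STEPS * BATCH_SIZE

# Assumption: the GPU was busy for the whole 15 days.
seconds = TRAINING_DAYS * 24 * 60 * 60
steps_per_second = STEPS / seconds
examples_per_second = examples_seen / seconds

print(f"examples processed: {examples_seen:,}")             # 76,800,000
print(f"steps per second: {steps_per_second:.3f}")          # 0.154
print(f"examples per second: {examples_per_second:.1f}")    # 59.3
```

Under these assumptions the model saw about 76.8 million training examples, at roughly 59 examples per second.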
- Multi-task learning across 600+ tasks
- Classification layers shared across all multiple-choice tasks
- Weights shared between classification tasks whose labels match
- A specially constructed NLI dataset included in training to improve zero-shot performance
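The label-based weight sharing in the list above can be pictured as a registry that hands out one classification head per distinct label set. The sketch below is an illustration only, not the author's training code: `HeadRegistry`, `head_for`, and the dictionary standing in for a head are hypothetical, and it assumes that "matching" means identical labels in identical order.

```python
class HeadRegistry:
    """Hands out one shared head per distinct label set (illustrative only)."""

    def __init__(self):
        self._heads = {}

    def head_for(self, task_name, labels):
        # Assumption: two tasks match only if their labels are identical,
        # including order.
        key = tuple(labels)
        if key not in self._heads:
            # A plain dict stands in for a real classification layer.
            self._heads[key] = {"labels": key, "tasks": []}
        head = self._heads[key]
        head["tasks"].append(task_name)
        return head


# Task names and label lists are examples chosen for illustration.
registry = HeadRegistry()
nli_labels = ["entailment", "neutral", "contradiction"]
mnli_head = registry.head_for("mnli", nli_labels)
anli_head = registry.head_for("anli", nli_labels)
sst2_head = registry.head_for("sst2", ["negative", "positive"])

print(mnli_head is anli_head)  # True: same labels, same head
print(mnli_head is sst2_head)  # False: different labels, separate head
print(mnli_head["tasks"])      # ['mnli', 'anli']
```

In a real multi-task setup the dictionary would be replaced by a trainable layer, so that every task with the same label set updates the same parameters.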
Core Capabilities
- Zero-shot classification with arbitrary labels
- Natural Language Inference (NLI) tasks
- Support for hundreds of tasks via tasksource-adapters
- Fine-tuning capabilities for new tasks
- Token classification and multiple-choice task handling
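Zero-shot classification with arbitrary labels can be sketched with the Hugging Face `transformers` zero-shot pipeline, which turns each candidate label into an NLI hypothesis and scores it against the input text. The Hub id `sileod/deberta-v3-base-tasksource-nli` is an assumption (it does not appear in this document), and `build_hypotheses` and `classify` are hypothetical helper names.

```python
# Assumption: this is the checkpoint's id on the Hugging Face Hub.
MODEL_ID = "sileod/deberta-v3-base-tasksource-nli"


def build_hypotheses(labels, template="This example is {}."):
    """Show the NLI hypotheses built from the candidate labels.

    The default template mirrors the default of the zero-shot pipeline.
    """
    return [template.format(label) for label in labels]


def classify(text, labels, model_id=MODEL_ID):
    """Run zero-shot classification; downloads the model on first use."""
    # Imported lazily so build_hypotheses works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=model_id)
    return classifier(text, candidate_labels=labels)
```

Calling `classify("The match went to extra time.", ["sports", "politics"])` should return a dictionary with `sequence`, `labels`, and `scores` keys, with the labels sorted from most to least likely.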
Frequently Asked Questions
Q: What makes this model unique?
The model was trained with multi-task learning on 600+ tasks, and it ranked first among all models with the DeBERTa-v3-base architecture in IBM's model recycling evaluation. It can be used zero-shot with arbitrary labels or, through tasksource-adapters, on hundreds of tasks.
Q: What are the recommended use cases?
The model is well suited to zero-shot classification and natural language inference, and it can be fine-tuned for specific classification tasks. It is particularly strong in scenarios that require understanding textual entailment and semantic relationships.
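For the textual entailment use case, a premise/hypothesis pair can also be scored directly instead of going through the zero-shot pipeline. The following is a minimal sketch assuming the `transformers` and `torch` packages and the Hub id `sileod/deberta-v3-base-tasksource-nli` (an assumption, not stated in this document); `softmax` and `nli_probabilities` are hypothetical helper names. Label names are read from the model configuration rather than hard-coded.

```python
import math

# Assumption: this is the checkpoint's id on the Hugging Face Hub.
MODEL_ID = "sileod/deberta-v3-base-tasksource-nli"


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def nli_probabilities(premise, hypothesis, model_id=MODEL_ID):
    """Return {label: probability} for one premise/hypothesis pair."""
    # Imported lazily so softmax can be used without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # The tokenizer encodes the two texts as a single sentence pair.
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    probs = softmax(logits)
    # Label names come from the checkpoint's own configuration.
    return {model.config.id2label[i]: p for i, p in enumerate(probs)}
```

For example, `nli_probabilities("A man is playing a guitar.", "Someone is making music.")` returns one probability per NLI label defined by the checkpoint.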