ModernBERT-large-zeroshot-v2.0
| Property | Value |
|---|---|
| Author | MoritzLaurer |
| Model Type | Zero-shot Text Classification |
| Base Architecture | ModernBERT-large |
| Average Accuracy | 85% |
| Model Link | Hugging Face |
What is ModernBERT-large-zeroshot-v2.0?
ModernBERT-large-zeroshot-v2.0 is a language model fine-tuned for zero-shot text classification. It is built on answerdotai's ModernBERT-large architecture and is faster and more memory-efficient than alternatives such as DeBERTa-v3. It supports an 8k-token context window and reaches 85% mean accuracy across the evaluated classification tasks.
Implementation Details
The model was trained for 2 epochs with the AdamW optimizer, using a learning rate of 9e-06, a batch size of 32, and a linear learning-rate schedule with a 6% warmup ratio. It supports bf16 precision, which is reported to run approximately 2x faster than fp16.
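To make the schedule concrete, here is a minimal sketch of a linear learning-rate schedule with warmup, using the hyperparameters stated above. The dataset size (160,000 examples) is a hypothetical value chosen only for illustration, and the helper names `total_steps` and `linear_lr` are assumptions, not part of the author's training code. The sketch also assumes no gradient accumulation.

```python
import math

# Hyperparameters stated in the text above.
PEAK_LR = 9e-06
BATCH_SIZE = 32
EPOCHS = 2
WARMUP_RATIO = 0.06


def total_steps(num_examples, batch_size=BATCH_SIZE, epochs=EPOCHS):
    """Optimizer steps for a full run, assuming no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


def linear_lr(step, num_training_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear ramp up to peak_lr, then linear decay to zero."""
    warmup_steps = math.ceil(num_training_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, num_training_steps - step)
    return peak_lr * remaining / max(1, num_training_steps - warmup_steps)


# Hypothetical dataset size, used only to make the numbers concrete.
steps = total_steps(num_examples=160_000)
for s in (0, steps // 2, steps):
    print(s, linear_lr(s, steps))
```

The learning rate starts at zero, peaks at 9e-06 once the first 6% of steps have passed, and falls linearly back to zero by the final step.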
Reported performance figures:
- Processes 1116 texts per second on an A100 40GB GPU
- Achieves 85% mean accuracy across diverse tasks
- Reaches 96.4% accuracy on Amazon Polarity (sentiment analysis)
- Reaches 94.2% accuracy on MNLI matched (natural language inference)
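As a back-of-the-envelope illustration of the throughput figure, the sketch below estimates how long a corpus would take to classify. It assumes throughput stays constant at the reported rate, which in practice depends on text length, batch size, and hardware. The corpus size and the helper name `estimated_seconds` are hypothetical.

```python
TEXTS_PER_SECOND = 1116  # figure reported above for an A100 40GB GPU


def estimated_seconds(num_texts, texts_per_second=TEXTS_PER_SECOND):
    """Rough wall-clock estimate, assuming throughput stays constant."""
    return num_texts / texts_per_second


# Hypothetical corpus of one million texts.
minutes = estimated_seconds(1_000_000) / 60
print(f"{minutes:.1f} minutes")  # prints "14.9 minutes"
```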
Core Capabilities
- Zero-shot text classification across multiple domains
- Efficient handling of long text sequences (8k-token context window)
- High-speed inference with low memory usage
- Robust performance on sentiment analysis, topic classification, and hate speech detection
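The sketch below shows one way these capabilities might be used through the Hugging Face `transformers` zero-shot classification pipeline. The Hub ID is assumed from the author and model name given above, and the hypothesis template and helper names (`build_hypotheses`, `classify`) are illustrative assumptions. In NLI-style zero-shot classification, each candidate label is turned into a hypothesis sentence that the model scores against the input text.

```python
# Assumed Hub ID, composed from the author and model name given above.
MODEL_ID = "MoritzLaurer/ModernBERT-large-zeroshot-v2.0"

# Illustrative template; adapt the wording to the task at hand.
DEFAULT_TEMPLATE = "This text is about {}"


def build_hypotheses(labels, template=DEFAULT_TEMPLATE):
    """Turn each candidate label into the hypothesis sentence the classifier scores."""
    return [template.format(label) for label in labels]


def classify(text, labels, template=DEFAULT_TEMPLATE):
    """Run zero-shot classification; downloads the model on first call."""
    # Imported lazily so build_hypotheses works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=MODEL_ID)
    return classifier(
        text,
        candidate_labels=labels,
        hypothesis_template=template,
        multi_label=False,
    )


# Inspect the hypotheses that would be scored for a hypothetical label set.
print(build_hypotheses(["politics", "economy", "entertainment"]))
```

Calling `classify("The central bank raised interest rates again.", ["politics", "economy", "entertainment"])` returns a dictionary whose `labels` and `scores` entries are sorted from most to least likely. For repeated calls, building the pipeline once and reusing it avoids reloading the model each time.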
Frequently Asked Questions
Q: What makes this model unique?
This model combines high speed and memory efficiency with strong classification performance. It processes text several times faster than DeBERTa-v3 while keeping accuracy competitive.
Q: What are the recommended use cases?
The model is well suited to zero-shot classification, particularly where large volumes of text must be processed quickly. It performs especially well on sentiment analysis, topic classification, and natural language inference.