ModernBERT-large-zeroshot-v2.0

Maintained by MoritzLaurer

Property            Value
Author              MoritzLaurer
Model Type          Zero-shot Text Classification
Base Architecture   ModernBERT-large
Average Accuracy    85%
Model Link          Hugging Face

What is ModernBERT-large-zeroshot-v2.0?

ModernBERT-large-zeroshot-v2.0 is a language model fine-tuned for zero-shot classification. It is built on answerdotai's ModernBERT-large and is notably faster and more memory-efficient than alternatives such as DeBERTa-v3. It supports an 8k-token context window and reaches 85% mean accuracy across the evaluated classification tasks.

Implementation Details

The model was fine-tuned for 2 epochs with the AdamW optimizer, using a learning rate of 9e-06, a batch size of 32, and a linear learning rate schedule with a 6% warmup ratio. Running in bf16 precision instead of fp16 is reported to give roughly a 2x speedup.
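As a rough illustration of the schedule described above, the sketch below computes the learning rate at a given optimizer step for linear warmup followed by linear decay. It is a plain-Python approximation, not the author's training code: the function names, the rounding of the warmup step count, and the total of 1000 steps are assumptions made for the example.

```python
import math


def warmup_step_count(total_steps, warmup_ratio=0.06):
    """Number of warmup steps for a given ratio (rounded up; an assumption)."""
    return math.ceil(total_steps * warmup_ratio)


def linear_warmup_lr(step, total_steps, base_lr=9e-06, warmup_ratio=0.06):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup = warmup_step_count(total_steps, warmup_ratio)
    if step < warmup:
        # Ramp linearly from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr down to 0 at the final step.
    return base_lr * max(0, total_steps - step) / max(1, total_steps - warmup)


# Hypothetical run length; the real step count depends on the dataset size.
TOTAL_STEPS = 1000
for step in (0, 30, 60, 530, 1000):
    print(step, linear_warmup_lr(step, TOTAL_STEPS))
```

The rate starts at zero, peaks at 9e-06 when warmup ends, and returns to zero at the final step.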

  • Processes 1116 texts per second on an A100 40GB GPU
  • Achieves 85% mean accuracy across diverse tasks
  • Excels in sentiment analysis with 96.4% accuracy on Amazon Polarity
  • Strong performance on NLI tasks (94.2% on MNLI matched)
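To put the throughput figure in context, the following sketch estimates how long a batch job would take at the reported 1116 texts per second. The function name and the example workload sizes are hypothetical, and the real rate will vary with text length, batch size, and hardware, so treat the result as a back-of-the-envelope estimate only.

```python
def estimated_seconds(num_texts, texts_per_second=1116.0):
    """Estimate processing time assuming a constant throughput."""
    if texts_per_second <= 0:
        raise ValueError("texts_per_second must be positive")
    return num_texts / texts_per_second


# Hypothetical workload sizes, chosen only for illustration.
for n in (10_000, 1_000_000):
    seconds = estimated_seconds(n)
    print(f"{n} texts: about {seconds / 60:.1f} minutes")
```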

Core Capabilities

  • Zero-shot text classification across multiple domains
  • Efficient handling of long text sequences (8k context window)
  • High-speed inference with low memory usage
  • Robust performance on sentiment analysis, topic classification, and hate speech detection
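A minimal sketch of how these capabilities might be used through the Hugging Face `transformers` zero-shot classification pipeline is shown below. Several things here are assumptions rather than facts from this page: the Hub identifier `MoritzLaurer/ModernBERT-large-zeroshot-v2.0`, the hypothesis template, and the helper names. It also assumes an installed `transformers` release that includes ModernBERT support.

```python
# Assumed Hub identifier, inferred from the author and model name.
MODEL_ID = "MoritzLaurer/ModernBERT-large-zeroshot-v2.0"

# Illustrative template; each candidate label is substituted for "{}".
HYPOTHESIS_TEMPLATE = "This text is about {}"


def build_hypotheses(labels, template=HYPOTHESIS_TEMPLATE):
    """Show the hypothesis sentences the pipeline pairs with the input text."""
    return [template.format(label) for label in labels]


def classify(text, labels, model_id=MODEL_ID):
    """Score `text` against `labels` with the zero-shot pipeline."""
    # Imported lazily so build_hypotheses works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=model_id)
    return classifier(
        text,
        labels,
        hypothesis_template=HYPOTHESIS_TEMPLATE,
        multi_label=False,  # scores are normalized across the labels
    )


print(build_hypotheses(["politics", "economy", "sports"]))
```

Calling `classify("The central bank raised interest rates again.", ["politics", "economy", "sports"])` downloads the model on first use and returns a dictionary whose `labels` and `scores` entries are sorted from most to least likely.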

Frequently Asked Questions

Q: What makes this model unique?

This model combines high speed and memory efficiency with strong classification performance. It processes text several times faster than DeBERTa-v3 while keeping accuracy competitive.

Q: What are the recommended use cases?

The model is ideal for zero-shot classification tasks, particularly in scenarios requiring high-speed processing of large text volumes. It performs especially well in sentiment analysis, topic classification, and natural language inference tasks.
