DeBERTa-v3-xsmall-mnli-fever-anli-ling-binary

MoritzLaurer

Efficient binary NLI model (70.8M parameters) built on the DeBERTa-v3 architecture, trained on 782K hypothesis-premise pairs from 4 datasets and optimized for zero-shot classification

Property             Value
Parameter Count      70.8M
License              MIT
Paper                DeBERTa-V3 Paper
Training Data        782,357 hypothesis-premise pairs
Accuracy (MNLI-m)    92.5%

What is DeBERTa-v3-xsmall-mnli-fever-anli-ling-binary?

This is a natural language inference (NLI) model built on Microsoft's DeBERTa-v3-xsmall architecture and fine-tuned for binary classification. Given a premise and a hypothesis, it predicts whether the premise entails the hypothesis or not, which makes it well suited to zero-shot classification.
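To make the premise/hypothesis setup concrete, here is a minimal sketch of scoring a single pair. It assumes the checkpoint is published on the Hugging Face Hub under the identifier in `MODEL_NAME` (author name plus model name from this page) and that `transformers`, `torch`, and a DeBERTa-compatible tokenizer backend such as `sentencepiece` are installed. The function name `entailment_scores` is illustrative, not part of any library.

```python
import math

# Assumed Hub identifier: author name + model name as shown on this page.
MODEL_NAME = "MoritzLaurer/DeBERTa-v3-xsmall-mnli-fever-anli-ling-binary"


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entailment_scores(premise, hypothesis, model_name=MODEL_NAME):
    """Return {label: probability} for one premise-hypothesis pair."""
    # Imported lazily so the softmax helper works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # The tokenizer encodes premise and hypothesis as one sentence pair.
    inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    probs = softmax(logits)
    # Label names are read from the checkpoint config rather than hard-coded.
    return {model.config.id2label[i]: p for i, p in enumerate(probs)}
```

Calling `entailment_scores("The film was a delight.", "The reviewer liked the movie.")` downloads the weights on first use and should return one probability per class; the exact label strings depend on the checkpoint's configuration.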

Implementation Details

The model was fine-tuned on four NLI datasets (MultiNLI, Fever-NLI, LingNLI, and ANLI) using mixed precision training, a learning rate of 2e-05, a weight decay of 0.06, a batch size of 32 per device, and learning-rate warmup.

  • Binary classification focus (entailment/non-entailment)
  • Optimized for zero-shot applications
  • Mixed precision training (FP16)
  • Batch size of 32 samples per device
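The hyperparameters above translate into a training schedule roughly as sketched below. The learning rate, weight decay, batch size, and dataset size are taken from this page; the device count, epoch count, warmup ratio, and the linear shape of the warmup are assumptions made only for illustration.

```python
import math

# Values stated on this page.
TRAIN_PAIRS = 782_357
PER_DEVICE_BATCH_SIZE = 32
LEARNING_RATE = 2e-05
WEIGHT_DECAY = 0.06

# Assumptions for illustration only; this page does not state them.
NUM_DEVICES = 1
NUM_EPOCHS = 5
WARMUP_RATIO = 0.1


def steps_per_epoch(num_examples, per_device_batch, num_devices=1):
    """Optimizer steps needed to see every example once (last batch may be partial)."""
    return math.ceil(num_examples / (per_device_batch * num_devices))


def warmup_steps(total_steps, warmup_ratio):
    """Number of initial steps during which the learning rate ramps up."""
    return math.ceil(total_steps * warmup_ratio)


def linear_warmup_lr(step, peak_lr, n_warmup):
    """Learning rate at `step`, assuming a linear ramp from 0 to `peak_lr`.

    Any decay after warmup is left out of this sketch.
    """
    if n_warmup <= 0 or step >= n_warmup:
        return peak_lr
    return peak_lr * step / n_warmup


per_epoch = steps_per_epoch(TRAIN_PAIRS, PER_DEVICE_BATCH_SIZE, NUM_DEVICES)
total = per_epoch * NUM_EPOCHS
n_warmup = warmup_steps(total, WARMUP_RATIO)
print(f"steps per epoch: {per_epoch}, warmup steps (assumed ratio): {n_warmup}")
```

With a single device, 782,357 pairs at 32 per batch come to 24,449 steps per epoch; the warmup figure changes with whatever epoch count and ratio were actually used.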

Core Capabilities

  • High accuracy on MNLI matched (92.5%) and mismatched (92.2%) sets
  • Efficient processing speed (473 texts/sec on GPU)
  • Robust performance across multiple NLI datasets
  • Optimized for resource-efficient deployment
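For capacity planning, the quoted GPU speed can be turned into a rough time estimate. The sketch below assumes the 473 texts/sec figure counts one premise-hypothesis pair per text and that zero-shot classification needs one forward pass per candidate label; this page states neither the benchmark hardware nor the batch size, so treat the result as an order-of-magnitude guide.

```python
GPU_PAIRS_PER_SEC = 473  # speed quoted in the list above


def estimated_seconds(num_documents, num_candidate_labels=1,
                      pairs_per_sec=GPU_PAIRS_PER_SEC):
    """Rough wall-clock estimate for scoring documents against candidate labels.

    Assumes each (document, label) pair costs one forward pass.
    """
    pairs = num_documents * num_candidate_labels
    return pairs / pairs_per_sec


# Example: 100,000 documents checked against 5 candidate labels.
seconds = estimated_seconds(100_000, num_candidate_labels=5)
print(f"about {seconds / 60:.1f} minutes")
```

The practical point is that zero-shot cost grows linearly with the number of candidate labels, so a long label list can erase much of the speed advantage of a small model.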

Frequently Asked Questions

Q: What makes this model unique?

The model's binary classification approach and its optimization for zero-shot classification, combined with its efficient architecture (70.8M parameters), make it particularly suitable for practical applications where distinguishing neutral from contradiction isn't necessary.
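The binary framing can be illustrated by collapsing standard three-way NLI labels into two classes. This is a sketch of the idea, not the author's preprocessing code; the label strings, including `not_entailment`, are assumptions chosen for readability.

```python
# Assumed label strings; real datasets may use integer ids or other names.
THREE_WAY_TO_BINARY = {
    "entailment": "entailment",
    "neutral": "not_entailment",
    "contradiction": "not_entailment",
}


def to_binary_label(three_way_label):
    """Map a three-way NLI label to the binary entailment / not_entailment scheme."""
    try:
        return THREE_WAY_TO_BINARY[three_way_label]
    except KeyError:
        raise ValueError(f"unknown NLI label: {three_way_label!r}") from None


examples = ["entailment", "neutral", "contradiction"]
print([to_binary_label(label) for label in examples])
```

Merging neutral and contradiction leaves only the question that zero-shot classification asks: does the text support this label's hypothesis or not?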

Q: What are the recommended use cases?

This model is ideal for zero-shot classification tasks, text entailment verification, and applications requiring binary decisions about text relationships. It is particularly useful when computational efficiency matters and high accuracy must be maintained.
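For the zero-shot use case, each candidate label is turned into a hypothesis and the model scores whether the input text entails it. The sketch below uses the `zero-shot-classification` pipeline from the `transformers` library; the Hub identifier in `MODEL_NAME`, the hypothesis template, and the function names are assumptions for illustration.

```python
# Assumed Hub identifier: author name + model name as shown on this page.
MODEL_NAME = "MoritzLaurer/DeBERTa-v3-xsmall-mnli-fever-anli-ling-binary"

# Hypothetical template; wording can be tuned to the task.
DEFAULT_TEMPLATE = "This example is about {}."


def build_hypotheses(labels, template=DEFAULT_TEMPLATE):
    """Show the hypotheses the model will be asked to verify, one per label."""
    return [template.format(label) for label in labels]


def zero_shot_classify(text, labels, template=DEFAULT_TEMPLATE,
                       model_name=MODEL_NAME):
    """Return {label: score} for `text`, highest-scoring label first."""
    # Imported lazily so build_hypotheses works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=model_name)
    result = classifier(
        text,
        candidate_labels=labels,
        hypothesis_template=template,
        multi_label=False,  # scores are normalized across the candidate labels
    )
    return dict(zip(result["labels"], result["scores"]))


print(build_hypotheses(["politics", "economy", "sports"]))
```

Calling `zero_shot_classify("The central bank raised interest rates again.", ["politics", "economy", "sports"])` downloads the model on first use. Setting `multi_label=True` instead scores each label independently, which fits cases where several labels can apply at once.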
