distilbart-mnli-12-3

valhalla

A DistilBART model fine-tuned on the MNLI task that reaches 88.1% matched accuracy via knowledge distillation, optimized for zero-shot classification.

Author: valhalla
Downloads: 22,963
Task: Zero-Shot Classification
Framework: PyTorch, JAX

What is distilbart-mnli-12-3?

DistilBART-MNLI-12-3 is a distilled version of the BART-large-MNLI model, created using the "no teacher" distillation technique. It keeps all 12 encoder layers but reduces the decoder to 3 layers, achieving 88.1% matched accuracy and 88.19% mismatched accuracy on the MNLI dataset while being substantially more efficient than its parent model.

Implementation Details

The model is implemented using a novel distillation approach where alternating layers from BART-large-MNLI are copied and then fine-tuned on the same data. This technique provides an excellent balance between model performance and efficiency.

  • Uses 12 encoder layers and 3 decoder layers
  • Achieves near-parent model performance with reduced parameters
  • Implemented in both PyTorch and JAX frameworks
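The layer-copy step behind this distillation can be sketched in a few lines: pick an evenly spaced subset of the teacher's decoder layers to initialize the student before fine-tuning. The helper name and index formula below are illustrative, not the exact code used to build the checkpoint.

```python
# Sketch of the "no teacher" distillation layer selection:
# from a 12-layer BART-large decoder, keep an evenly spaced
# subset of layers to seed the smaller student decoder.

def pick_student_layers(n_teacher: int, n_student: int) -> list[int]:
    """Choose n_student evenly spaced layer indices out of n_teacher."""
    if n_student == 1:
        return [0]
    step = (n_teacher - 1) / (n_student - 1)
    return [round(i * step) for i in range(n_student)]

# 12 decoder layers in BART-large -> 3 in distilbart-mnli-12-3
print(pick_student_layers(12, 3))  # → [0, 6, 11]
```

Copying the first, a middle, and the last layer preserves the decoder's input and output interfaces, which is why the student recovers most of the teacher's accuracy after fine-tuning on the same MNLI data.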

Core Capabilities

  • Zero-shot text classification
  • Natural language inference tasks
  • Efficient processing with reduced parameter count
  • Maintains 88.1% matched accuracy on MNLI dataset
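For the NLI capability specifically, the model can be queried directly with a premise/hypothesis pair. A minimal sketch (the example sentences are illustrative; the label order is read from the model config rather than hardcoded):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "valhalla/distilbart-mnli-12-3"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

# Encode premise and hypothesis as a sentence pair
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map probabilities back to label names from the config
probs = logits.softmax(dim=-1)[0]
print({model.config.id2label[i]: round(p.item(), 3) for i, p in enumerate(probs)})
```

The softmax over the three MNLI classes (contradiction, neutral, entailment) gives a calibrated-looking score for each relation between the two sentences.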

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for a distillation technique that maintains high performance while reducing model size: matched accuracy drops by only about 1.8 percentage points compared to the original BART-large-MNLI, while inference is significantly cheaper.

Q: What are the recommended use cases?

The model is ideal for zero-shot classification tasks, especially when computational efficiency is important. It's particularly well-suited for natural language inference and text classification scenarios where full BART-large-MNLI might be computationally expensive.
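For zero-shot classification, the checkpoint plugs directly into the Hugging Face `pipeline` API. A minimal sketch (the input text and candidate labels are made up for illustration):

```python
from transformers import pipeline

# Zero-shot classification with the distilled MNLI checkpoint
classifier = pipeline(
    "zero-shot-classification",
    model="valhalla/distilbart-mnli-12-3",
)

result = classifier(
    "The new GPU delivers twice the throughput at half the power draw.",
    candidate_labels=["technology", "politics", "sports"],
)

# Labels come back sorted by score, highest first
print(result["labels"][0])
```

Under the hood the pipeline frames each label as an NLI hypothesis ("This example is about {label}.") and ranks labels by the entailment score, which is why an MNLI-tuned model works for arbitrary label sets without further training.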
