DistilBART-MNLI-12-1
| Property | Value |
|---|---|
| Author | valhalla |
| Task Type | Zero-Shot Classification |
| Framework | PyTorch |
| Downloads | 26,869 |
What is distilbart-mnli-12-1?
DistilBART-MNLI-12-1 is a distilled version of bart-large-mnli created with the No Teacher Distillation technique. It keeps all 12 encoder layers but reduces the decoder to a single layer, offering an efficient alternative that preserves most of the original's performance: 87.08% accuracy on matched and 87.5% on mismatched MNLI, versus 89.9% and 90.01% for the original model.
Implementation Details
The model follows the No Teacher Distillation recipe: alternating layers from bart-large-mnli are copied into a smaller student, which is then fine-tuned on the same MNLI data (a minimal sketch follows the list below). This yields a strong balance between efficiency and performance, with only a small accuracy drop despite the much smaller decoder.
- 12 encoder layers retained from bart-large-mnli
- Single decoder layer for efficient processing
- Built on PyTorch framework
- Specialized for zero-shot classification tasks
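The snippet below is only a minimal sketch of the layer-copying idea behind this distillation, not the authors' actual scripts; the attribute paths (`model.model.encoder`, `model.model.decoder.layers`, `classification_head`) are assumptions based on the PyTorch BART implementation in transformers, and the fine-tuning step is omitted.

```python
# Minimal sketch of no-teacher distillation by layer copying.
# Assumption: the 12-1 student keeps every encoder layer plus a single
# decoder layer copied from the teacher before being fine-tuned on MNLI.
import copy
import torch.nn as nn
from transformers import BartConfig, BartForSequenceClassification

# Teacher: the full bart-large-mnli model (12 encoder / 12 decoder layers).
teacher = BartForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

# Student config: same encoder depth, decoder shrunk to one layer.
student_config = BartConfig.from_pretrained(
    "facebook/bart-large-mnli", encoder_layers=12, decoder_layers=1
)
student = BartForSequenceClassification(student_config)

# Copy the teacher's encoder wholesale, one decoder layer, and the
# classification head; fine-tuning on the same data would follow.
student.model.encoder = copy.deepcopy(teacher.model.encoder)
student.model.decoder.layers = nn.ModuleList(
    [copy.deepcopy(teacher.model.decoder.layers[0])]
)
student.classification_head = copy.deepcopy(teacher.classification_head)
```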
Core Capabilities
- Zero-shot text classification
- Natural language inference tasks
- Efficient processing with reduced parameter count
- Maintains strong performance metrics despite compression
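For zero-shot classification, the checkpoint can be loaded through the standard transformers pipeline. The example below assumes the `valhalla/distilbart-mnli-12-1` Hub id and is a minimal usage sketch rather than a production setup; the input sentence and candidate labels are illustrative.

```python
# Zero-shot classification with the distilled checkpoint.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="valhalla/distilbart-mnli-12-1",
)

result = classifier(
    "The new GPU delivers excellent frame rates at 4K resolution.",
    candidate_labels=["technology", "sports", "politics"],
)

# Labels are returned sorted by score; print the top prediction.
print(result["labels"][0], result["scores"][0])
```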
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its successful application of No Teacher Distillation, achieving near-baseline accuracy while cutting the decoder from 12 layers to 1. It offers an excellent balance between efficiency and accuracy.
Q: What are the recommended use cases?
The model is ideal for zero-shot classification tasks where efficient processing is required without significantly compromising accuracy. It's particularly suitable for production environments where resource optimization is crucial.
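Beyond the pipeline, the checkpoint can also be queried directly as an MNLI classifier for premise/hypothesis pairs. The sketch below assumes the label order used by bart-large-mnli (0 = contradiction, 1 = neutral, 2 = entailment) carries over to the distilled model; check `model.config.id2label` to confirm before relying on it.

```python
# Direct NLI inference: score a premise/hypothesis pair.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "valhalla/distilbart-mnli-12-1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 3)

probs = logits.softmax(dim=-1)[0]
# Assumed MNLI label order: contradiction, neutral, entailment.
for label, p in zip(["contradiction", "neutral", "entailment"], probs.tolist()):
    print(f"{label}: {p:.3f}")
```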