DistilBART-MNLI-12-6

  • Author: valhalla
  • Task Type: Zero-Shot Classification
  • Framework: PyTorch, JAX
  • Training Data: MNLI dataset

What is distilbart-mnli-12-6?

DistilBART-MNLI-12-6 is a distilled version of the BART-large-MNLI model, created using the No Teacher Distillation technique. It keeps 12 encoder layers and 6 decoder layers and reaches 89.19% matched and 89.01% mismatched accuracy on the MNLI dataset, close to the parent model while being smaller and faster at inference.
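
The model can be loaded through the standard Hugging Face `transformers` zero-shot pipeline; the input text and candidate labels below are illustrative (requires `pip install transformers torch`):

```python
# Zero-shot classification with distilbart-mnli-12-6 via the
# standard transformers pipeline API.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="valhalla/distilbart-mnli-12-6",
)

result = classifier(
    "The new graphics card renders 4K games at 120 frames per second.",
    candidate_labels=["technology", "politics", "cooking"],
)
print(result["labels"][0])   # highest-scoring label
print(result["scores"])      # probabilities over the candidate labels
```

In the default single-label mode the scores are softmaxed across the candidate labels, so they sum to 1.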

Implementation Details

The distillation approach copies alternating layers from BART-large-MNLI into a smaller student model, which is then fine-tuned on the same MNLI data. This yields an excellent balance between model size and performance, with only a minimal drop in accuracy relative to the original model.

  • 12 encoder layers and 6 decoder layers architecture
  • Trained on MNLI dataset
  • Implements zero-shot classification capability
  • Achieves near-original model performance with fewer parameters
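
The copy-then-fine-tune step can be sketched in plain Python. The even-spacing heuristic below is an assumption for illustration; the actual distillation scripts use a hand-tuned layer map:

```python
def pick_alternating_layers(n_teacher: int, n_student: int) -> list[int]:
    """Choose which teacher layers to copy into a smaller student.

    Spreads n_student indices evenly across the teacher's n_teacher
    layers; for 12 -> 6 this selects every other layer. (Illustrative
    heuristic -- the real distillation code uses a fixed layer map.)
    """
    step = n_teacher / n_student
    return [round(i * step) for i in range(n_student)]

# BART-large has 12 decoder layers; distilbart-mnli-12-6 keeps 6 of them.
decoder_layers_to_copy = pick_alternating_layers(12, 6)
print(decoder_layers_to_copy)  # [0, 2, 4, 6, 8, 10]
```

After copying the selected layers' weights into the student, the student is fine-tuned on MNLI directly, with no separate teacher supervision loss.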

Core Capabilities

  • Zero-shot text classification
  • Natural language inference tasks
  • Efficient inference with reduced model size
  • Cross-framework compatibility (PyTorch and JAX)
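
Under the hood, zero-shot classification reframes each candidate label as an NLI hypothesis (e.g. "This example is {label}.") and softmaxes the entailment scores across labels. A minimal sketch with placeholder logits (not real model outputs):

```python
import math

def zero_shot_from_entailment(entailment_logits: dict) -> dict:
    """Turn per-label entailment logits into class probabilities.

    Each label is scored by pairing the input text (premise) with the
    hypothesis "This example is {label}." and taking the entailment
    logit; a softmax across labels then gives a single-label
    probability distribution.
    """
    exps = {label: math.exp(logit) for label, logit in entailment_logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Placeholder entailment logits for illustration only.
logits = {"technology": 3.1, "politics": -0.4, "cooking": -1.2}
probs = zero_shot_from_entailment(logits)
print(max(probs, key=probs.get))  # "technology"
```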

Frequently Asked Questions

Q: What makes this model unique?

Its distinguishing feature is the No Teacher Distillation technique: alternating layers are copied from BART-large-MNLI and fine-tuned directly, yielding near-original performance with fewer parameters. It retains 89.19% matched accuracy compared to the original model's 89.9%, making it an efficient alternative for production deployments.

Q: What are the recommended use cases?

The model is particularly well suited to zero-shot classification tasks that require efficient inference: production environments with resource constraints that still need strong natural language inference performance.
