DistilCamemBERT-NLI
| Property | Value |
|---|---|
| Author | cmarkea |
| Task | Natural Language Inference (French) |
| Base Model | DistilCamemBERT |
| Paper | Link to Paper |
| Average Inference Time | 51.35 ms |
What is distilcamembert-base-nli?
DistilCamemBERT-NLI is a French language model specialized for Natural Language Inference (NLI). Built on DistilCamemBERT and fine-tuned on the XNLI dataset, it reaches 77.45% on NLI (global F1-score) while running in roughly half the inference time of CamemBERT-based alternatives, trading a small amount of accuracy for substantially lower latency.
Implementation Details
The model was trained on the XNLI dataset, which comprises 392,702 premise-hypothesis pairs for training and 5,010 pairs for testing. It is optimized to determine whether a premise entails, contradicts, or is neutral toward a hypothesis, which makes it particularly useful for zero-shot classification (see the sketch after the metrics below). Evaluation on the XNLI test set yields:
- Global F1-score: 77.45%
- Contradiction detection: 79.54% F1-score
- Entailment detection: 78.87% F1-score
- Neutral detection: 74.04% F1-score
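As a quick illustration of zero-shot use, here is a minimal sketch built on the Hugging Face `transformers` pipeline. The input text, candidate labels, and French hypothesis template are illustrative placeholders, not values taken from the model card.

```python
from transformers import pipeline

# Zero-shot classification built on the NLI head: each candidate label is
# turned into a French hypothesis and scored for entailment.
classifier = pipeline(
    task="zero-shot-classification",
    model="cmarkea/distilcamembert-base-nli",
    tokenizer="cmarkea/distilcamembert-base-nli",
)

result = classifier(
    "Le film était magnifique, la photographie est superbe.",  # illustrative text
    candidate_labels=["cinéma", "politique", "sport"],         # illustrative labels
    hypothesis_template="Ce texte parle de {}.",               # assumed French template
)
print(result["labels"][0], result["scores"][0])
```

The `hypothesis_template` is what links zero-shot classification back to NLI: each label is substituted into the template and the model scores whether the input entails the resulting hypothesis.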
Core Capabilities
- Zero-shot classification for French text
- Efficient inference with an average processing time of 51.35 ms
- Sentiment analysis (80.59% accuracy on the Allociné dataset)
- Topic classification (79.30% accuracy on the MLSUM dataset)
- ONNX Runtime support for optimized deployment (see the sketch below)
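The capability list mentions ONNX Runtime support; one common route is Hugging Face's `optimum` library, sketched below under the assumption that the checkpoint exports cleanly with the default configuration.

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "cmarkea/distilcamembert-base-nli"

# Export the PyTorch checkpoint to ONNX on the fly (export=True) and run it
# with ONNX Runtime; the exported model can also be saved for reuse.
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

classifier = pipeline("zero-shot-classification", model=model, tokenizer=tokenizer)
print(classifier(
    "La bourse de Paris est en hausse.",            # illustrative text
    candidate_labels=["économie", "culture"],       # illustrative labels
))
```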
Frequently Asked Questions
Q: What makes this model unique?
The model's primary strength is its efficiency-to-performance ratio. While similar models such as CamemBERT-base-xnli and mDeBERTa-v3 may offer slightly higher accuracy, DistilCamemBERT-NLI runs roughly twice as fast (51.35 ms vs. 105.0 ms for CamemBERT), making it well suited to production environments where latency matters. A rough way to check latency on your own hardware is sketched below.
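Reported latencies depend on hardware, sequence length, and batch size, so here is a minimal sketch for measuring average per-call latency yourself; the run count and inputs are arbitrary choices, not from the model card.

```python
import time
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="cmarkea/distilcamembert-base-nli",
)

text = "Une phrase française de longueur moyenne pour le test."  # illustrative input
labels = ["économie", "sport", "culture"]                        # illustrative labels

classifier(text, candidate_labels=labels)  # warm-up run (weights, caches)

n_runs = 50
start = time.perf_counter()
for _ in range(n_runs):
    classifier(text, candidate_labels=labels)
elapsed = (time.perf_counter() - start) / n_runs
print(f"average latency: {elapsed * 1000:.2f} ms")
```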
Q: What are the recommended use cases?
The model excels at French-language tasks including text classification, sentiment analysis, topic categorization, and natural language inference. It is particularly valuable for zero-shot classification scenarios where labeled training data isn't available, and in production environments where fast inference is essential. For scoring a premise-hypothesis pair directly, see the sketch below.
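For direct NLI rather than zero-shot classification, a minimal sketch with `AutoModelForSequenceClassification` follows. The premise and hypothesis are illustrative, and the label mapping is read from the checkpoint's config rather than hardcoded, since the exact label order is not stated here.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "cmarkea/distilcamembert-base-nli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "Le chat dort sur le canapé."                # illustrative premise
hypothesis = "Un animal est en train de dormir."       # illustrative hypothesis

# Encode the pair together so the model sees premise and hypothesis jointly.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1).squeeze()
# id2label comes from the checkpoint config, so no label order is assumed.
for idx, prob in enumerate(probs.tolist()):
    print(f"{model.config.id2label[idx]}: {prob:.3f}")
```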