DistilCamemBERT-NLI
| Property | Value |
|---|---|
| Author | cmarkea |
| Task | Natural Language Inference (French) |
| Base Model | DistilCamemBERT |
| Paper | Link to Paper |
| Average Inference Time | 51.35 ms |
What is distilcamembert-base-nli?
DistilCamemBERT-NLI is a French language model specialized for Natural Language Inference (NLI). Built on DistilCamemBERT and fine-tuned on the XNLI dataset, it reaches 77.45% on NLI (global F1-score) while running in roughly half the inference time of CamemBERT-based alternatives, trading a small amount of accuracy for substantially lower latency.
Implementation Details
The model was trained on the XNLI dataset, which comprises 392,702 premise-hypothesis pairs for training and 5,010 pairs for testing. It is optimized to determine whether a premise entails, contradicts, or is neutral toward a hypothesis, which makes it particularly useful for zero-shot classification (see the sketch after the metrics below). Evaluation on the XNLI test set yields:
- Global F1-score: 77.45%
- Contradiction detection: 79.54% F1-score
- Entailment detection: 78.87% F1-score
- Neutral detection: 74.04% F1-score
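As a quick illustration of zero-shot use, here is a minimal sketch built on the Hugging Face `transformers` pipeline. The input text, candidate labels, and French hypothesis template are illustrative placeholders, not values taken from the model card.

```python
from transformers import pipeline

# Zero-shot classification built on the NLI head: each candidate label is
# turned into a French hypothesis and scored for entailment.
classifier = pipeline(
    task="zero-shot-classification",
    model="cmarkea/distilcamembert-base-nli",
    tokenizer="cmarkea/distilcamembert-base-nli",
)

result = classifier(
    "Le film était magnifique, la photographie est superbe.",  # illustrative text
    candidate_labels=["cinéma", "politique", "sport"],         # illustrative labels
    hypothesis_template="Ce texte parle de {}.",               # assumed French template
)
print(result["labels"][0], result["scores"][0])
```

The `hypothesis_template` is what links zero-shot classification back to NLI: each label is substituted into the template and the model scores whether the input entails the resulting hypothesis.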
Core Capabilities
- Zero-shot classification for French text
- Efficient inference with an average processing time of 51.35 ms
- Sentiment analysis (80.59% accuracy on the Allociné dataset)
- Topic classification (79.30% accuracy on the MLSUM dataset)
- ONNX Runtime support for optimized deployment (see the sketch below)
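The capability list mentions ONNX Runtime support; one common route is Hugging Face's `optimum` library, sketched below under the assumption that the checkpoint exports cleanly with the default configuration.

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "cmarkea/distilcamembert-base-nli"

# Export the PyTorch checkpoint to ONNX on the fly (export=True) and run it
# with ONNX Runtime; the exported model can also be saved for reuse.
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

classifier = pipeline("zero-shot-classification", model=model, tokenizer=tokenizer)
print(classifier(
    "La bourse de Paris est en hausse.",            # illustrative text
    candidate_labels=["économie", "culture"],       # illustrative labels
))
```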
Frequently Asked Questions
Q: What makes this model unique?
The model's primary strength is its efficiency-to-performance ratio. While similar models such as CamemBERT-base-xnli and mDeBERTa-v3 may offer slightly higher accuracy, DistilCamemBERT-NLI runs roughly twice as fast (51.35 ms vs. 105.0 ms for CamemBERT), making it well suited to production environments where latency matters. A rough way to check latency on your own hardware is sketched below.
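Reported latencies depend on hardware, sequence length, and batch size, so here is a minimal sketch for measuring average per-call latency yourself; the run count and inputs are arbitrary choices, not from the model card.

```python
import time
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="cmarkea/distilcamembert-base-nli",
)

text = "Une phrase française de longueur moyenne pour le test."  # illustrative input
labels = ["économie", "sport", "culture"]                        # illustrative labels

classifier(text, candidate_labels=labels)  # warm-up run (weights, caches)

n_runs = 50
start = time.perf_counter()
for _ in range(n_runs):
    classifier(text, candidate_labels=labels)
elapsed = (time.perf_counter() - start) / n_runs
print(f"average latency: {elapsed * 1000:.2f} ms")
```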
Q: What are the recommended use cases?
The model excels at French-language tasks including text classification, sentiment analysis, topic categorization, and natural language inference. It is particularly valuable for zero-shot classification scenarios where labeled training data isn't available, and in production environments where fast inference is essential. For scoring a premise-hypothesis pair directly, see the sketch below.
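For direct NLI rather than zero-shot classification, a minimal sketch with `AutoModelForSequenceClassification` follows. The premise and hypothesis are illustrative, and the label mapping is read from the checkpoint's config rather than hardcoded, since the exact label order is not stated here.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "cmarkea/distilcamembert-base-nli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "Le chat dort sur le canapé."                # illustrative premise
hypothesis = "Un animal est en train de dormir."       # illustrative hypothesis

# Encode the pair together so the model sees premise and hypothesis jointly.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1).squeeze()
# id2label comes from the checkpoint config, so no label order is assumed.
for idx, prob in enumerate(probs.tolist()):
    print(f"{model.config.id2label[idx]}: {prob:.3f}")
```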