distilcamembert-base

cmarkea

A distilled French language model based on CamemBERT. It reaches an 83% F1-score on the FLUE CLS task while being lighter and faster than its parent model.

Property           Value
Author             cmarkea
Model Type         Distilled French Language Model
Training Data      OSCAR Dataset (140GB)
Training Duration  18 days on Nvidia Titan RTX
Paper              Download PDF

What is distilcamembert-base?

DistilCamemBERT is a compressed version of the CamemBERT model for French language processing. It is trained by knowledge distillation, in which a smaller student model learns to reproduce the behavior of the larger teacher model, so it needs less compute while keeping most of the teacher's performance. The model achieves an 83% F1-score on the FLUE CLS task and 98% accuracy on French NER.

Implementation Details

The model is trained with a three-part loss function, a weighted sum of DistilLoss (50%), CosineLoss (30%), and MLMLoss (20%). The first two terms pull the student toward the teacher, and the MLM term keeps the student trained on the masked language modeling objective itself.
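The weighting above can be illustrated with a small, dependency-free sketch. Only the three term names and the 50/30/20 weights come from the text. The exact form of each term is an assumption made for illustration: KL divergence for DistilLoss, one minus cosine similarity for CosineLoss, and cross-entropy for MLMLoss. All function names are hypothetical.

```python
import math

# Weights reported in the text: DistilLoss 50%, CosineLoss 30%, MLMLoss 20%.
W_DISTIL, W_COSINE, W_MLM = 0.5, 0.3, 0.2


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distil_loss(teacher_logits, student_logits):
    """Assumed form: KL divergence between teacher and student outputs."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cosine_loss(teacher_hidden, student_hidden):
    """Assumed form: 1 - cosine similarity of the two hidden-state vectors."""
    dot = sum(t * s for t, s in zip(teacher_hidden, student_hidden))
    norm_t = math.sqrt(sum(t * t for t in teacher_hidden))
    norm_s = math.sqrt(sum(s * s for s in student_hidden))
    return 1.0 - dot / (norm_t * norm_s)


def mlm_loss(student_logits, target_index):
    """Assumed form: cross-entropy of the student on the masked token."""
    return -math.log(softmax(student_logits)[target_index])


def combined_loss(teacher_logits, student_logits,
                  teacher_hidden, student_hidden, target_index):
    """Weighted sum of the three terms, using the weights from the text."""
    return (W_DISTIL * distil_loss(teacher_logits, student_logits)
            + W_COSINE * cosine_loss(teacher_hidden, student_hidden)
            + W_MLM * mlm_loss(student_logits, target_index))
```

Under these assumptions, a student that matches the teacher exactly drives the first two terms to zero, leaving only the weighted MLM term.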

  • Trained on the OSCAR dataset (same as original CamemBERT)
  • Implements masked language modeling capabilities
  • Optimized for French language understanding tasks
  • Preserves core functionality while reducing model size

Core Capabilities

  • Text Classification (83% FLUE CLS score)
  • Named Entity Recognition (98% accuracy)
  • Cross-lingual Natural Language Inference (77% XNLI score)
  • Masked Language Modeling
  • Semantic Analysis

Frequently Asked Questions

Q: What makes this model unique?

DistilCamemBERT is a distilled version of CamemBERT that needs fewer computational resources than its parent while performing close to it on French language tasks, for example an 83% F1-score on FLUE CLS.

Q: What are the recommended use cases?

The model suits French language processing tasks such as text classification, named entity recognition, and masked language modeling. It is a good fit for applications that need accurate French language understanding on a limited compute budget.
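As a rough illustration of the masked language modeling use case, the sketch below queries the model through the Hugging Face `transformers` fill-mask pipeline. Several things here are assumptions rather than facts from this page: the Hub identifier `cmarkea/distilcamembert-base`, the `<mask>` token (the CamemBERT convention), and the helper names `mask_word` and `top_fillers`, which are hypothetical.

```python
MODEL_ID = "cmarkea/distilcamembert-base"  # assumed Hub identifier
MASK_TOKEN = "<mask>"                      # assumed CamemBERT-style mask token


def mask_word(sentence, word, mask_token=MASK_TOKEN):
    """Replace the first occurrence of `word` with the mask token."""
    if word not in sentence:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return sentence.replace(word, mask_token, 1)


def top_fillers(masked_sentence, top_k=5, model_id=MODEL_ID):
    """Return (token, score) pairs predicted for the masked position.

    Requires the `transformers` package with a backend such as PyTorch,
    and downloads the model weights on first use.
    """
    from transformers import pipeline  # imported lazily to keep the helper above dependency-free

    fill_mask = pipeline("fill-mask", model=model_id, tokenizer=model_id)
    predictions = fill_mask(masked_sentence, top_k=top_k)
    return [(p["token_str"], p["score"]) for p in predictions]


# Example call (not run here because it downloads the model):
# top_fillers(mask_word("Le camembert est délicieux", "délicieux"))
```

The same checkpoint could be fine-tuned for classification or NER, but that needs labeled data and a task-specific head, which this sketch does not cover.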
