CamemBERT-large

Property	Value
Parameter Count	337M parameters
Model Type	Transformer-based Language Model
Training Data	CCNet (135GB of text)
Paper	Research Paper
Tensor Type	F32

What is camembert-large?

CamemBERT-large is a state-of-the-art French language model based on the RoBERTa architecture. As the larger variant of the CamemBERT family, it contains 337M parameters and was trained on an extensive 135GB dataset from CCNet. This model represents a significant advancement in French natural language processing, offering superior performance for various NLP tasks.

Implementation Details

The model is implemented using the Transformers library and PyTorch backend, featuring a large architecture configuration with advanced tokenization capabilities through SentencePiece. It supports both inference endpoints and Safetensors, making it versatile for production deployments.

Large-scale architecture with 337M parameters for enhanced modeling capacity
Trained on CCNet corpus (135GB) for comprehensive French language understanding
Built on RoBERTa architecture with optimizations for French language processing
Supports masked language modeling and feature extraction

Core Capabilities

Contextual word embeddings generation
Masked language modeling for text completion
Feature extraction from all 24 attention layers
Support for both inference and fine-tuning workflows

Frequently Asked Questions

Q: What makes this model unique?

CamemBERT-large is distinguished by its large parameter count (337M) and extensive training on French-specific data, making it particularly effective for French language tasks. It offers state-of-the-art performance while maintaining compatibility with the Hugging Face ecosystem.

Q: What are the recommended use cases?

The model excels at various French NLP tasks, including text classification, named entity recognition, and masked language modeling. It's particularly suitable for applications requiring deep language understanding in French, such as content analysis, text generation, and semantic search.

camembert-large