CamemBERT-large
| Property | Value |
|---|---|
| Parameter Count | 337M parameters |
| Model Type | Transformer-based Language Model |
| Training Data | CCNet (135GB of text) |
| Paper | CamemBERT: a Tasty French Language Model (Martin et al., 2020) |
| Tensor Type | F32 |
What is camembert-large?
CamemBERT-large is a French language model based on the RoBERTa architecture. It is the larger variant in the CamemBERT family, with 337M parameters, and was pretrained on 135GB of text from CCNet. It is intended as a pretrained encoder that can be used directly for masked-token prediction and feature extraction, or fine-tuned for downstream French NLP tasks.
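To make the 337M figure more concrete, the sketch below tallies the parameters of a RoBERTa-style encoder with a masked language modeling head. The configuration values (vocabulary size, hidden size, layer count, feed-forward size, position count) are assumptions that do not appear in the text above; they should be checked against the model's `config.json`. Under these assumptions the total comes out at roughly 337M.

```python
# Assumed configuration values; verify them against the checkpoint's config.json.
VOCAB_SIZE = 32005      # SentencePiece vocabulary (assumption)
HIDDEN = 1024           # hidden size (assumption)
LAYERS = 24             # number of Transformer layers
INTERMEDIATE = 4096     # feed-forward inner size (assumption)
MAX_POSITIONS = 514     # learned position embeddings (assumption)
TYPE_VOCAB = 1          # token-type embeddings (assumption)


def linear(n_in, n_out):
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out


def layer_norm(n):
    """Scale and shift vectors of a LayerNorm."""
    return 2 * n


def count_parameters():
    """Parameter count of the encoder plus a masked language modeling head."""
    embeddings = (VOCAB_SIZE + MAX_POSITIONS + TYPE_VOCAB) * HIDDEN + layer_norm(HIDDEN)
    # Query, key, value and output projections, followed by a LayerNorm.
    attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN)
    feed_forward = (
        linear(HIDDEN, INTERMEDIATE) + linear(INTERMEDIATE, HIDDEN) + layer_norm(HIDDEN)
    )
    encoder = LAYERS * (attention + feed_forward)
    # MLM head: dense layer, LayerNorm and an output bias. The output projection
    # is assumed to share its weights with the word embeddings.
    lm_head = linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN) + VOCAB_SIZE
    return embeddings + encoder + lm_head


total = count_parameters()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Most of the budget sits in the 24 Transformer layers; the embedding table accounts for only about a tenth of the total under these assumptions.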
Implementation Details
The model is available through the Transformers library with a PyTorch backend. It uses the large architecture configuration and a SentencePiece tokenizer. Its weights are published in the Safetensors format, and it can be served through inference endpoints, which helps with production deployments.
- Large architecture configuration with 337M parameters
- Pretrained on the CCNet corpus (135GB of text)
- Built on the RoBERTa architecture, with a tokenizer and training corpus specific to French
- Supports masked language modeling and feature extraction
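As an illustration of the masked language modeling use mentioned above, here is a minimal sketch built on the Transformers `fill-mask` pipeline. The Hub identifier `camembert/camembert-large` is an assumption (the identifier of your copy may differ), and `mask_word` and `top_completions` are hypothetical helper names introduced only for this example.

```python
MODEL_ID = "camembert/camembert-large"  # assumed Hub identifier; adjust if yours differs
MASK_TOKEN = "<mask>"  # mask token used by the CamemBERT tokenizer


def mask_word(sentence, word, mask_token=MASK_TOKEN):
    """Replace the first occurrence of `word` in `sentence` with the mask token."""
    if word not in sentence:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return sentence.replace(word, mask_token, 1)


def top_completions(masked_sentence, model_id=MODEL_ID, k=5):
    """Return the k most likely fillers as (token, score) pairs.

    The checkpoint is downloaded on first use.
    """
    from transformers import pipeline  # imported lazily: heavy optional dependency

    fill_mask = pipeline("fill-mask", model=model_id)
    results = fill_mask(masked_sentence, top_k=k)
    return [(r["token_str"], r["score"]) for r in results]


prompt = mask_word("Le camembert est délicieux :)", "délicieux")
print(prompt)

# Querying the model requires `transformers`, `torch` and `sentencepiece`:
# for token, score in top_completions(prompt):
#     print(f"{token}\t{score:.3f}")
```

The model call is left commented out because it downloads more than a gigabyte of weights; uncomment it once the dependencies are installed.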
Core Capabilities
- Contextual word embeddings generation
- Masked language modeling for text completion
- Feature extraction from any of the 24 Transformer layers
- Support for both inference and fine-tuning workflows
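The feature extraction capability listed above can be sketched as follows. With `output_hidden_states=True`, Transformers models return a tuple whose first entry is the embedding output, followed by one tensor per Transformer layer. Averaging the last few layers is one common heuristic, not something the model prescribes. The Hub identifier and the helper names `last_layer_indices` and `extract_features` are assumptions made for this example.

```python
MODEL_ID = "camembert/camembert-large"  # assumed Hub identifier
NUM_LAYERS = 24  # number of Transformer layers in the large configuration


def last_layer_indices(last_n, num_layers=NUM_LAYERS):
    """Indices into `hidden_states` for the last `last_n` Transformer layers.

    Index 0 holds the embedding output, so layer k lives at index k.
    """
    if not 1 <= last_n <= num_layers:
        raise ValueError(f"last_n must be between 1 and {num_layers}")
    return list(range(num_layers - last_n + 1, num_layers + 1))


def extract_features(text, last_n=4, model_id=MODEL_ID):
    """Return (tokens, features), with features of shape [seq_len, hidden_size].

    Features are the average of the last `last_n` layers for each token.
    """
    import torch  # imported lazily: heavy optional dependencies
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)

    selected = [outputs.hidden_states[i] for i in last_layer_indices(last_n)]
    features = torch.stack(selected).mean(dim=0)[0]  # drop the batch dimension
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return tokens, features


print(last_layer_indices(4))
```

Calling `extract_features("J'aime le camembert !")` would load the checkpoint and return one vector per subword token; passing `last_n=1` keeps only the final layer.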
Frequently Asked Questions
Q: What makes this model unique?
CamemBERT-large combines a large parameter count (337M) with pretraining on French-specific data, which makes it well suited to French language tasks. It is also compatible with the Hugging Face ecosystem, so it can be loaded and fine-tuned with the standard Transformers tooling.
Q: What are the recommended use cases?
The model is suited to French NLP tasks such as text classification, named entity recognition, and masked language modeling. It fits applications that need a deep understanding of French text, such as content analysis and semantic search. As an encoder-only masked language model, it predicts masked tokens rather than generating open-ended text.
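For the semantic search use case, a minimal sketch of the ranking step is shown below. It assumes each text has already been turned into a single vector, for example by averaging the model's token features; the toy 3-dimensional vectors stand in for real embeddings, and the function names are hypothetical. Embeddings from a pretrained checkpoint may need task-specific fine-tuning before the ranking quality is useful.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for sentence embeddings produced by the model.
query = [1.0, 0.0, 1.0]
documents = [
    [0.0, 1.0, 0.0],
    [1.0, 0.1, 0.9],
    [0.5, 0.5, 0.5],
]

for index, score in rank_documents(query, documents):
    print(index, round(score, 3))
```

Cosine similarity ignores vector length, so documents are compared by direction only; this is a common default for embedding-based retrieval, not a requirement of the model.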