camembert-ner
Property | Value |
---|---|
Parameter Count | 110M |
License | MIT |
Training Data | wikiner_fr (~170,634 sentences) |
Tensor Type | F32, I64 |
What is camembert-ner?
camembert-ner is a Named Entity Recognition (NER) model for French, fine-tuned from CamemBERT. It identifies and classifies named entities in French text and is notably good at recognizing entities that do not begin with an uppercase letter, a case where models that lean on capitalization cues tend to struggle.
Implementation Details
The model was fine-tuned on the wikiner_fr dataset (~170,634 sentences). It uses the transformer architecture, is implemented in PyTorch, and is also available in ONNX and Safetensors formats. It assigns entities to four classes: Person (PER), Organization (ORG), Location (LOC), and Miscellaneous (MISC). The reported F1 scores are:
- Overall F1 Score: 0.8914
- Person Entity Recognition: 0.9483 F1
- Location Entity Recognition: 0.8955 F1
- Organization Entity Recognition: 0.8181 F1
- Miscellaneous Entity Recognition: 0.8146 F1
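To put the overall figure in context, it can be compared with the unweighted (macro) mean of the four per-class scores. The short sketch below only does that arithmetic on the numbers listed above. The overall score comes out higher than the macro mean, which would be consistent with an average weighted toward the more frequent, better-scoring classes; the source does not say how the overall score was computed, so treat that reading as an assumption.

```python
# Per-class F1 scores exactly as listed above.
per_class_f1 = {"PER": 0.9483, "LOC": 0.8955, "ORG": 0.8181, "MISC": 0.8146}
reported_overall_f1 = 0.8914

# Macro average: every class counts equally, regardless of how often it occurs.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(f"macro F1: {macro_f1:.4f}")                               # macro F1: 0.8691
print(f"overall - macro: {reported_overall_f1 - macro_f1:.4f}")  # overall - macro: 0.0223
```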
Core Capabilities
- Robust French named entity recognition
- Strong performance on informal text such as emails and chat data
- Effective recognition of non-capitalized entities
- High-accuracy person name detection (94.83% F1 score)
- Integration with the Hugging Face Transformers library
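As a rough illustration of the Transformers integration, the sketch below wraps the model in a token-classification pipeline and groups the detected entities by class. The Hub identifier `Jean-Baptiste/camembert-ner` is an assumption (the text above does not name the repository), and the helper names `build_ner_pipeline`, `group_by_label`, and `extract_entities` are hypothetical, introduced only for this example.

```python
from collections import defaultdict

# Assumed Hub identifier; the text above does not name the repository.
MODEL_ID = "Jean-Baptiste/camembert-ner"


def build_ner_pipeline(model_id=MODEL_ID):
    """Create a token-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the pure helpers below work without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word tokens into whole entities.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def group_by_label(entities):
    """Group aggregated pipeline output (a list of dicts) by 'entity_group'."""
    grouped = defaultdict(list)
    for ent in entities:
        grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


def extract_entities(text, ner=None):
    """Run NER on text; builds the pipeline only if no callable is supplied."""
    if ner is None:
        ner = build_ner_pipeline()
    return group_by_label(ner(text))
```

Calling `extract_entities("apple est créée le 1er avril 1976 par steve jobs à los altos")` would download the model on first use and return a dictionary keyed by PER, ORG, LOC, and MISC; passing a prebuilt pipeline as `ner` avoids reloading it on every call.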
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its handling of informal text and of entities written without capitalization, which makes it well suited to French emails, chat data, and user-generated content.
Q: What are the recommended use cases?
The model suits French text analysis tasks such as email processing, chat analysis, document parsing, and general named entity extraction. It is particularly well suited to applications that depend on reliable person name detection, given its 94.83% F1 score in that category.
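For the person-name use case, a minimal post-processing sketch might look like the following. It assumes entity records in the shape produced by a Transformers token-classification pipeline with aggregation enabled (`entity_group`, `score`, `word`). The helper name `person_names` and the 0.9 confidence threshold are hypothetical choices for illustration, and the sample records are hand-written, not real model output.

```python
def person_names(entities, min_score=0.9):
    """Return unique PER entity strings with score >= min_score, in input order."""
    seen = set()
    names = []
    for ent in entities:
        # Skip non-person entities and low-confidence predictions.
        if ent["entity_group"] != "PER" or ent["score"] < min_score:
            continue
        name = ent["word"].strip()
        key = name.casefold()  # case-insensitive de-duplication
        if key not in seen:
            seen.add(key)
            names.append(name)
    return names


# Hand-written records in the pipeline's output shape, not real model output.
sample = [
    {"entity_group": "PER", "score": 0.99, "word": "marie dupont"},
    {"entity_group": "ORG", "score": 0.97, "word": "sncf"},
    {"entity_group": "PER", "score": 0.55, "word": "lyon"},
    {"entity_group": "PER", "score": 0.98, "word": "Marie Dupont"},
]
print(person_names(sample))  # ['marie dupont']
```

The threshold trades recall for precision; a suitable value depends on the data and should be tuned on a labeled sample rather than taken from this sketch.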