# ner-german-large
| Property | Value |
|---|---|
| Framework | PyTorch / Flair |
| Task | Named Entity Recognition |
| Performance | 92.31% F1-score (CoNLL-03 German) |
| Paper | FLERT: Document-Level Features for NER |
| Downloads | 183,295 |
## What is ner-german-large?
ner-german-large is a state-of-the-art Named Entity Recognition model specifically designed for German text. Built using the FLERT architecture and document-level XLM-R embeddings, this model excels at identifying four types of entities: person names (PER), locations (LOC), organizations (ORG), and miscellaneous names (MISC).
## Implementation Details
The model leverages document-context-aware transformer embeddings based on XLM-RoBERTa large. It employs a streamlined architecture without CRF or RNN layers, relying instead on direct fine-tuning of the transformer embeddings.
- Uses XLM-RoBERTa large as base model with document context
- Implements FLERT architecture for enhanced NER performance
- Trained with AdamW optimizer and OneCycleLR scheduler
- Hidden size of 256 dimensions
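A sketch of how a FLERT-style tagger of this kind is assembled in Flair; the corpus object and the exact hyperparameter values below are illustrative, not the model's verbatim training script (CoNLL-03 German also requires separately licensed data):

```python
from flair.datasets import CONLL_03_GERMAN
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus must be obtained separately and placed where Flair expects it
corpus = CONLL_03_GERMAN()
label_dict = corpus.make_label_dictionary(label_type="ner")

# Document-context XLM-R embeddings, as in the FLERT paper
embeddings = TransformerWordEmbeddings(
    model="xlm-roberta-large",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
    use_context=True,  # document-level features
)

# No CRF, no RNN: the transformer embeddings are fine-tuned directly
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() uses AdamW with a OneCycleLR schedule by default
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "resources/ner-german-large",  # hypothetical output directory
    learning_rate=5e-6,
    mini_batch_size=4,
)
```

Disabling the CRF, RNN, and reprojection layers keeps the architecture minimal, so all of the modeling capacity comes from the fine-tuned transformer itself.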
## Core Capabilities
- High-accuracy entity detection with 92.31% F1-score
- Four-class entity classification (PER, LOC, ORG, MISC)
- Handles complex German text with context awareness
- Seamless integration with Flair framework
- Efficient batch processing capabilities
## Frequently Asked Questions
### Q: What makes this model unique?
This model stands out for its use of document-level features through the FLERT architecture, which allows it to leverage broader context for more accurate entity recognition. Its impressive F1-score of 92.31% on the CoNLL-03 German dataset makes it one of the top performers for German NER.
### Q: What are the recommended use cases?
The model is ideal for applications requiring German named entity recognition, such as information extraction, document analysis, customer data processing, and automated content tagging. It is particularly effective where accurately distinguishing person names, locations, and organizations in German text matters.