nbailab-base-ner-scandi

Maintained By
saattrupdan

Property          Value
Model Size        676 MB
Processing Speed  4.16 samples/second
Average F1 Score  89.08%
Author            saattrupdan
Model Hub         Hugging Face

What is nbailab-base-ner-scandi?

nbailab-base-ner-scandi is a Named Entity Recognition (NER) model for Scandinavian languages. It is built on the NbAiLab/nb-bert-base model and supports Danish, Norwegian (both Bokmål and Nynorsk), Swedish, Icelandic, and Faroese. The model tags four entity types: Person (PER), Location (LOC), Organization (ORG), and Miscellaneous (MISC).
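
As a minimal sketch of how the model might be called, the snippet below loads it through the Hugging Face transformers pipeline and groups the detected entities by type. The Hub ID is an assumption formed from the author and model name listed above, and build_ner_pipeline and group_by_label are hypothetical helper names used only for illustration.

```python
from collections import defaultdict

# Assumed Hub ID, formed from the author and model name listed on this page.
MODEL_ID = "saattrupdan/nbailab-base-ner-scandi"


def build_ner_pipeline(model_id: str = MODEL_ID):
    """Load a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so group_by_label works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="first" merges word pieces into whole-word entities.
    return pipeline("ner", model=model_id, aggregation_strategy="first")


def group_by_label(entities):
    """Group aggregated pipeline output by entity type (PER, LOC, ORG, MISC)."""
    grouped = defaultdict(list)
    for entity in entities:
        # Aggregated results carry the label under the "entity_group" key.
        grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)
```

With transformers installed, group_by_label(build_ner_pipeline()("Hans bor i København.")) would return a dictionary mapping each detected label to the matching words.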

Implementation Details

The model was fine-tuned with a learning rate of 2e-05, a batch size of 32, and the Adam optimizer. The training data combined several datasets, including DaNE, NorNE, SUC 3.0, and WikiANN, and the reported results are state-of-the-art across all supported languages.

  • Training ran for 14 epochs with a linear learning rate schedule
  • Gradients were accumulated over 4 steps
  • Reported accuracy is state-of-the-art at a smaller size (676 MB) than alternatives such as da_dacy_large_trf (2,090 MB)
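
The training settings above can be illustrated with a short sketch of a linear learning rate schedule combined with gradient accumulation. This is not the author's training code. The zero-warmup default, the micro-batch size passed to optimizer_steps, and the sample count in the usage note are all assumptions, since the page does not state them.

```python
import math

BASE_LR = 2e-5          # learning rate stated above
ACCUMULATION_STEPS = 4  # gradient accumulation steps stated above
EPOCHS = 14             # number of epochs stated above


def linear_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=0):
    """Optional linear warmup, then linear decay to zero at total_steps."""
    # warmup_steps=0 is an assumption; the page does not mention warmup.
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def optimizer_steps(num_samples, micro_batch_size,
                    accumulation_steps=ACCUMULATION_STEPS, epochs=EPOCHS):
    """Count optimizer updates: one update per accumulation_steps micro-batches."""
    micro_batches = math.ceil(num_samples / micro_batch_size)
    return math.ceil(micro_batches / accumulation_steps) * epochs
```

For example, if the stated batch size of 32 is the effective size, the micro-batch would be 8, and a hypothetical corpus of 1,000 samples gives total = optimizer_steps(1000, 8) updates, with linear_lr(step, total) supplying the rate at each update.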

Core Capabilities

  • Covers Danish, Norwegian (Bokmål and Nynorsk), Swedish, Icelandic, and Faroese in a single model
  • F1 scores of 87.44% for Danish, 91.06% for Norwegian Bokmål, and 88.37% for Swedish
  • Processes 4.16 samples per second
  • Reasonable performance on English text, attributed to cross-lingual training
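
To make the F1 figures above concrete, here is a minimal sketch of entity-level F1 under exact-match scoring, where a prediction counts only if its span and label both match a gold entity. The page does not say how its scores were computed, so the scoring scheme and the entity_f1 helper are assumptions.

```python
def entity_f1(gold, predicted):
    """Exact-match F1 over (start, end, label) entity tuples."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    # Precision: share of predicted entities that are correct.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    # Recall: share of gold entities that were found.
    recall = true_positives / len(gold_set) if gold_set else 0.0

    if precision + recall == 0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

A prediction with the right span but the wrong label, such as tagging a location as ORG, counts as both a false positive and a false negative under this scheme.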

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for covering six Scandinavian language varieties in one model while maintaining state-of-the-art accuracy. At 676 MB it is considerably smaller, and also faster, than competitors like da_dacy_large_trf (2,090 MB), which makes it more practical for production deployments.
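
For rough capacity planning, the figures quoted above can be turned into a back-of-the-envelope estimate. This sketch assumes the 4.16 samples/second figure holds on the target hardware, which the page does not specify, and estimated_minutes is a hypothetical helper.

```python
SAMPLES_PER_SECOND = 4.16   # processing speed quoted on this page
MODEL_SIZE_MB = 676         # size of nbailab-base-ner-scandi
DACY_LARGE_SIZE_MB = 2090   # size of da_dacy_large_trf


def estimated_minutes(num_samples, samples_per_second=SAMPLES_PER_SECOND):
    """Estimate processing time in minutes at a constant throughput."""
    return num_samples / samples_per_second / 60


# How many times larger the competing model is on disk.
size_ratio = DACY_LARGE_SIZE_MB / MODEL_SIZE_MB
```

Actual throughput depends on hardware, batch size, and sentence length, so the estimate is best treated as a starting point and checked against a measurement on the deployment machine.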

Q: What are the recommended use cases?

The model suits applications that need named entity recognition in Scandinavian languages, such as information extraction, content analysis, and automated text processing. It is particularly useful for organizations working with multilingual Scandinavian content.
