roberta-ner-multilingual

julian-schelb

Multilingual NER model supporting 21 languages, fine-tuned on the WikiANN dataset. It has 559M parameters and achieves an 88.26% F1 score for entity recognition across languages.

Property             Value
Parameter Count      559M
License              MIT
Languages Supported  21
Research Paper       View Paper
F1 Score             88.26%

What is roberta-ner-multilingual?

roberta-ner-multilingual is a multilingual Named Entity Recognition (NER) model based on the XLM-RoBERTa architecture. It identifies and classifies named entities (persons, organizations, and locations) in text across 21 languages.

Implementation Details

The model was fine-tuned on the WikiANN dataset, using 375,100 training sentences and 173,100 validation examples. It labels tokens in the IOB2 tagging format and is built on the XLM-RoBERTa architecture, which was pre-trained on 2.5TB of filtered CommonCrawl data.

  • Supports entity detection for PER (Person), ORG (Organization), and LOC (Location)
  • F1 score of 90% for Location (LOC) detection
  • F1 score of 91.15% for Person (PER) detection
  • F1 score of 82.91% for Organization (ORG) detection
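
To make the tagging scheme concrete, here is a minimal sketch of how IOB2 tags can be decoded into entity spans. In IOB2, "B-" marks the first token of an entity, "I-" marks a continuation, and "O" marks tokens outside any entity. The function name iob2_to_spans and the example sentence are illustrative assumptions, not part of the model's code, and the lenient handling of a stray "I-" tag is a design choice of this sketch.

```python
from typing import List, Tuple


def iob2_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert IOB2 tags to (entity_type, start, end) spans; end is exclusive."""
    spans = []
    current_type = None
    start = 0
    for i, tag in enumerate(tags):
        # The open span continues only on an "I-" tag of the same type.
        continues = tag.startswith("I-") and tag[2:] == current_type
        if current_type is not None and not continues:
            spans.append((current_type, start, i))
            current_type = None
        if tag.startswith("B-"):
            current_type, start = tag[2:], i
        elif tag.startswith("I-") and current_type is None:
            # Stray "I-" with no open span of that type: leniently start a new span.
            current_type, start = tag[2:], i
    if current_type is not None:
        spans.append((current_type, start, len(tags)))
    return spans


# Hand-written example tags (not actual model output).
tokens = ["Angela", "Merkel", "visited", "the", "United", "Nations", "in", "New", "York"]
tags = ["B-PER", "I-PER", "O", "O", "B-ORG", "I-ORG", "O", "B-LOC", "I-LOC"]

for entity_type, begin, end in iob2_to_spans(tags):
    print(entity_type, " ".join(tokens[begin:end]))
```

Because every entity begins with an explicit "B-" tag, two adjacent entities of the same type stay separate, which is the main practical difference from the original IOB scheme.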

Core Capabilities

  • Multilingual support for 21 languages including English, German, French, Chinese, and more
  • Overall accuracy of 93.98%
  • Efficient token classification using the IOB2 format
  • Easy integration with HuggingFace Transformers library
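
As a rough illustration of the Transformers integration, the sketch below loads the model through the library's token-classification pipeline. The Hub identifier is an assumption derived from the author and model name shown on this page, the helper names build_ner_pipeline and extract_entities are hypothetical, and the approach assumes the checkpoint's configuration exposes its label names. The first call downloads the model weights.

```python
# Assumed Hub identifier, built from the author and model name on this page.
MODEL_ID = "julian-schelb/roberta-ner-multilingual"


def build_ner_pipeline(model_id: str = MODEL_ID):
    """Create a token-classification pipeline; downloads weights on first use."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces into whole entities.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def extract_entities(ner, text: str):
    """Return (entity_type, surface_text) pairs for one input string."""
    return [(entity["entity_group"], entity["word"]) for entity in ner(text)]
```

A typical call would be ner = build_ner_pipeline() followed by extract_entities(ner, "Angela Merkel besuchte Paris."). The same pipeline object can be reused for text in any of the supported languages.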

Frequently Asked Questions

Q: What makes this model unique?

The model handles 21 languages while maintaining an overall F1 score of 88.26%, which makes it particularly valuable for multilingual NER tasks. It is built on the XLM-RoBERTa architecture and fine-tuned specifically for named entity recognition.

Q: What are the recommended use cases?

This model is ideal for multilingual information extraction, document analysis, and entity recognition in various languages. It's particularly useful for applications requiring cross-lingual entity detection in news articles, academic texts, and general content analysis.
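
For information-extraction use cases like these, predictions usually need light post-processing. The sketch below groups predicted entities by type and drops low-confidence ones. It assumes predictions arrive as dictionaries with entity_group, word, and score keys, which is the shape returned by the Transformers token-classification pipeline when aggregation is enabled. The function name group_entities, the 0.8 threshold, and the sample records are illustrative assumptions, not model output.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_entities(predictions: Iterable[dict], min_score: float = 0.8) -> Dict[str, List[str]]:
    """Group entity strings by type, skipping low-confidence and duplicate entries."""
    grouped = defaultdict(list)
    for pred in predictions:
        if pred["score"] < min_score:
            continue  # below the (arbitrary) confidence threshold
        name = pred["word"].strip()
        if name not in grouped[pred["entity_group"]]:
            grouped[pred["entity_group"]].append(name)
    return dict(grouped)


# Hand-written records in the assumed prediction format.
sample = [
    {"entity_group": "PER", "word": "Marie Curie", "score": 0.99},
    {"entity_group": "LOC", "word": "Paris", "score": 0.97},
    {"entity_group": "ORG", "word": "Sorbonne", "score": 0.55},
    {"entity_group": "LOC", "word": "Paris", "score": 0.95},
]

print(group_entities(sample))
```

The threshold trades recall for precision; a suitable value depends on the application and should be tuned on held-out data.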
