xlm-roberta-base-ner-silvanus
| Property | Value |
|---|---|
| Parameter Count | 277M |
| License | MIT |
| Languages Supported | Indonesian, English, Spanish, Italian, Slovak |
| Research Paper | View Paper |
| Performance Metrics | F1: 0.923, Accuracy: 0.986 |
What is xlm-roberta-base-ner-silvanus?
xlm-roberta-base-ner-silvanus is a specialized Named Entity Recognition (NER) model built on the XLM-RoBERTa architecture. It is fine-tuned for multilingual token classification, with particular emphasis on identifying locations, dates, and times across five languages. The model reports 91.89% precision and a 92.31% F1 score.
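Assuming the model is published on the Hugging Face Hub, inference could use the standard token-classification pipeline. The repository id below is a placeholder, and the filtering helper is an illustrative addition, not part of the model card:

```python
from typing import Iterable

# The model card lists three entity types: locations, dates, times.
SILVANUS_LABELS = frozenset({"LOC", "DAT", "TIM"})


def keep_silvanus_entities(predictions: Iterable[dict],
                           labels=SILVANUS_LABELS) -> list:
    """Keep only aggregated pipeline predictions with the card's entity types."""
    return [p for p in predictions if p.get("entity_group") in labels]


def load_silvanus_pipeline(model_id: str = "your-org/xlm-roberta-base-ner-silvanus"):
    """Build a token-classification pipeline.

    `model_id` is a placeholder; substitute the actual Hub repository name.
    """
    from transformers import pipeline  # requires `pip install transformers`
    return pipeline("token-classification", model=model_id,
                    aggregation_strategy="simple")
```

Usage would look like `keep_silvanus_entities(load_silvanus_pipeline()(text))`, where each returned dict carries an `entity_group`, the matched `word`, and a confidence `score`.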
Implementation Details
The model is built upon the xlm-roberta-base architecture and fine-tuned with a learning rate of 2e-5 and a batch size of 8. Training ran for 3 epochs using the Adam optimizer with linear learning rate scheduling.
- Trained on Indonesian NER datasets with zero-shot transfer capabilities
- Supports token classification for locations (LOC), dates (DAT), and times (TIM)
- Utilizes advanced transfer learning techniques for cross-lingual performance
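The hyperparameters above can be gathered in one place; a minimal sketch of how they would map onto Hugging Face `TrainingArguments` (the output directory is a placeholder, and the dataset/trainer wiring is omitted):

```python
# Hyperparameters reported in the model card.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "num_train_epochs": 3,
    "lr_scheduler_type": "linear",  # linear learning rate scheduling
}


def build_training_args(output_dir: str = "./silvanus-ner"):
    """Map the card's hyperparameters onto transformers.TrainingArguments.

    An Adam-family optimizer (AdamW) is the TrainingArguments default,
    so it needs no explicit flag here.
    """
    from transformers import TrainingArguments  # requires `pip install transformers`
    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

These arguments would then be passed to a `Trainer` along with the tokenized NER dataset.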
Core Capabilities
- Multilingual NER processing across 5 languages
- High-precision entity recognition (91.89%)
- Specialized in processing social media content
- Zero-shot transfer learning for non-Indonesian languages
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its ability to perform high-accuracy NER tasks across multiple languages while being trained primarily on Indonesian data. Its zero-shot transfer learning capabilities make it particularly valuable for multilingual applications.
Q: What are the recommended use cases?
The model is specifically designed for extracting location, date, and time information from social media content across multiple languages. It's particularly useful for applications requiring multilingual information extraction from informal text sources.
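NER fine-tuning of this kind typically emits per-token BIO tags (e.g. B-LOC, I-LOC, O) that must be merged into entity spans. Assuming the usual BIO scheme over the card's LOC/DAT/TIM types, a simplified sketch of that merging step (which pipeline aggregation performs automatically):

```python
def merge_bio_tags(tokens, tags):
    """Merge per-token BIO tags into (entity_type, text) spans.

    A simplified version of what aggregation does in a
    token-classification pipeline; tags outside a valid B-/I- run
    are treated as O.
    """
    entities, current_type, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current_type:  # close any span already in progress
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)  # continue the current span
        else:
            if current_type:  # O tag (or stray I-) ends the span
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_type:  # flush a span that runs to the end of the sentence
        entities.append((current_type, " ".join(current_tokens)))
    return entities
```

For a Spanish social-media post, `merge_bio_tags(["Incendio", "en", "Los", "Angeles", "el", "12", "de", "agosto"], ["O", "O", "B-LOC", "I-LOC", "O", "B-DAT", "I-DAT", "I-DAT"])` yields `[("LOC", "Los Angeles"), ("DAT", "12 de agosto")]`.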