roberta-large-ontonotes5

Property	Value
Model Base	RoBERTa Large
Task	Token Classification (NER)
Dataset	OntoNotes5
F1 Score	90.86%
Downloads	381,833

What is roberta-large-ontonotes5?

roberta-large-ontonotes5 is a Named Entity Recognition (NER) model built on the RoBERTa-large architecture and fine-tuned on the OntoNotes5 dataset. It identifies 18 entity types, including persons, organizations, and locations, as well as more specialized categories such as cardinal numbers and works of art. It reports an overall F1 score of 90.86%.

Implementation Details

The model adds a CRF layer on top of RoBERTa. It was trained for 15 epochs with a batch size of 64, a learning rate of 1e-05, and a maximum sequence length of 128 tokens, using gradient accumulation and learning-rate warmup.
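To illustrate what a CRF layer contributes at inference time, here is a minimal sketch of Viterbi decoding in plain Python. It is not taken from the model's code: the three-tag set, the emission scores, and the transition scores are all made up for illustration, and the real model's tag set and learned parameters will differ.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][j]   : score of tag j at token t (e.g. from the encoder)
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters)
    """
    num_tags = len(transitions)
    # score[j] = best score of any path that ends in tag j at the current token
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        next_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for a path that ends in tag j at token t
            best_prev = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            next_score.append(score[best_prev] + transitions[best_prev][j] + emissions[t][j])
            pointers.append(best_prev)
        score = next_score
        backpointers.append(pointers)

    # Follow the backpointers from the best final tag to recover the path
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set: 0 = O, 1 = B-PERSON, 2 = I-PERSON
TRANSITIONS = [
    [0.0, 0.0, -1e4],  # from O: an I- tag may not follow O
    [0.0, 0.0, 0.5],   # from B-PERSON: continuing the entity is favored
    [0.0, 0.0, 0.5],   # from I-PERSON
]
EMISSIONS = [
    [0.1, 2.0, 0.0],   # token 0 looks like B-PERSON
    [1.0, 0.0, 0.9],   # token 1: O narrowly beats I-PERSON on its own
    [2.0, 0.0, 0.0],   # token 2 looks like O
]

greedy = [max(range(3), key=lambda j: row[j]) for row in EMISSIONS]
decoded = viterbi_decode(EMISSIONS, TRANSITIONS)
print(greedy, decoded)
```

In this toy input, per-token argmax labels the second token O, while the decoded path keeps it inside the PERSON entity because the transition scores reward a consistent sequence. That is the general motivation for placing a CRF on top of a token classifier.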

Uses a CRF layer for sequence labeling
Evaluated with both micro- and macro-averaged metrics
Achieves 92.84% F1 on entity span detection
Covers 18 entity types, with F1 varying by type
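The difference between micro and macro averaging can be sketched in a few lines. The helper names and the per-type counts below are invented for illustration; they are not the model's evaluation results.

```python
from typing import Dict, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_macro_f1(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float]:
    """counts maps entity type -> (tp, fp, fn). Returns (micro F1, macro F1)."""
    # Micro: pool the counts over all types, then compute a single F1
    total_tp = sum(c[0] for c in counts.values())
    total_fp = sum(c[1] for c in counts.values())
    total_fn = sum(c[2] for c in counts.values())
    micro = f1(total_tp, total_fp, total_fn)
    # Macro: compute F1 per type, then take the unweighted mean
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    return micro, macro


# Illustrative counts only: one frequent, easy type and one rare, hard type
example_counts = {
    "GPE": (90, 10, 10),
    "WORK_OF_ART": (5, 5, 5),
}
micro, macro = micro_macro_f1(example_counts)
print(f"micro={micro:.4f} macro={macro:.4f}")
```

Micro averaging is dominated by frequent entity types, while macro averaging weights every type equally, so a rare type with low F1 pulls the macro score down more than the micro score.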

Core Capabilities

Outstanding performance on geopolitical areas (96.87% F1)
Strong person entity recognition (95.56% F1)
Reliable organization detection (92.27% F1)
Effective handling of numerical entities (86.05% F1 for cardinal numbers)
Robust money and percentage recognition (90.11% and 91.71% F1 respectively)

Frequently Asked Questions

Q: What makes this model unique?

The model combines broad entity coverage (18 types) with strong per-type results. It is strongest on geopolitical areas (96.87% F1), persons (95.56% F1), and organizations (92.27% F1), which makes it suitable for NER tasks across a range of domains.

Q: What are the recommended use cases?

The model is best suited to applications that extract entities from formal text, such as news analysis, legal document processing, and business intelligence. Its strongest categories are organizations, personal names, and geographical references, which makes it a good fit for information extraction in professional contexts.
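For information extraction, per-token predictions usually need to be grouped into entity spans. The sketch below assumes the model's output has already been converted to one BIO tag per word (for example `B-PERSON`, `I-PERSON`, `O`). The function name is hypothetical and the tags in the example are hand-written, not actual model output.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_type, text) pairs."""
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the entity collected so far, if any
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_new:
            # A B- tag, or an I- tag that does not continue the open entity,
            # starts a new span
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            current_tokens.append(token)
        else:
            # "O" closes any open entity
            flush()
            current_type, current_tokens = None, []
    flush()
    return spans


# Hand-written example tags, assuming OntoNotes-style labels
tokens = ["Angela", "Merkel", "visited", "Apple", "in", "Paris"]
tags = ["B-PERSON", "I-PERSON", "O", "B-ORG", "O", "B-GPE"]
entities = bio_to_spans(tokens, tags)
print(entities)
```

Grouping by entity type afterwards (for example, collecting all ORG spans in a news article) is then a simple filter over the returned pairs.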