# en_core_web_trf
| Property | Value |
|---|---|
| License | MIT |
| Author | Explosion AI |
| Base Architecture | RoBERTa-base |
| spaCy Compatibility | ≥3.7.2, <3.8.0 |
## What is en_core_web_trf?
en_core_web_trf is spaCy's transformer-based English pipeline, built on the RoBERTa-base architecture. It is the most accurate of spaCy's English pipelines, trading throughput and memory footprint for state-of-the-art accuracy across tagging, parsing, and named entity recognition.
## Implementation Details
The model is built on the RoBERTa-base transformer architecture with byte-level BPE tokenization and a vocabulary of 50,265 tokens. The pipeline comprises transformer, tagger, parser, attribute ruler, lemmatizer, and named entity recognizer components.
- Transformer Configuration: 768-dimensional embeddings, with text processed in spans using a 144-token window
- Named Entity Recognition F-score: 90.19%
- Part-of-Speech Tagging Accuracy: 98.13%
- Dependency Parsing (LAS): 93.91%
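Assuming the pipeline is installed (`pip install spacy` plus `python -m spacy download en_core_web_trf`), the annotations above can be pulled out of a processed `Doc` as in this minimal sketch; the `extract_entities` helper is illustrative, not part of spaCy's API:

```python
def extract_entities(nlp, text):
    """Run `text` through a loaded spaCy pipeline and collect
    (entity text, entity label) pairs from the resulting Doc."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]

# Usage, once the model has been downloaded:
#   import spacy
#   nlp = spacy.load("en_core_web_trf")
#   extract_entities(nlp, "Apple is buying a U.K. startup for $1 billion.")
```

The same `doc` object also exposes the tagger and parser output via `token.pos_`, `token.dep_`, and `token.lemma_` on each token.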
## Core Capabilities
- Named Entity Recognition with 18 entity types
- Part-of-Speech Tagging with 50+ tag classes
- Dependency Parsing with 45 dependency labels
- Sentence Boundary Detection (90.11% F-score)
- Lemmatization and Attribute Assignment
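The 18 entity types follow the OntoNotes 5 annotation scheme. As a quick reference, the label set is sketched below (descriptions paraphrased from the OntoNotes guidelines; worth verifying against `nlp.get_pipe("ner").labels` on your installed copy):

```python
# OntoNotes 5 entity labels used by the pipeline's NER component.
ONTONOTES_ENTITY_TYPES = [
    "PERSON",       # People, including fictional
    "NORP",         # Nationalities, religious and political groups
    "FAC",          # Buildings, airports, highways, bridges
    "ORG",          # Companies, agencies, institutions
    "GPE",          # Countries, cities, states
    "LOC",          # Non-GPE locations: mountain ranges, bodies of water
    "PRODUCT",      # Objects, vehicles, foods (not services)
    "EVENT",        # Named hurricanes, battles, wars, sports events
    "WORK_OF_ART",  # Titles of books, songs, etc.
    "LAW",          # Named documents made into laws
    "LANGUAGE",     # Any named language
    "DATE",         # Absolute or relative dates or periods
    "TIME",         # Times smaller than a day
    "PERCENT",      # Percentages, including "%"
    "MONEY",        # Monetary values, including unit
    "QUANTITY",     # Measurements, e.g. weight or distance
    "ORDINAL",      # "first", "second", etc.
    "CARDINAL",     # Numerals not covered by another type
]

assert len(ONTONOTES_ENTITY_TYPES) == 18
```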
## Frequently Asked Questions
Q: What makes this model unique?
A: This model stands out for its exceptional accuracy across multiple NLP tasks, particularly in POS tagging (98.13%) and NER (90.19% F-score). It's built on the robust RoBERTa architecture and trained on high-quality datasets including OntoNotes 5.
Q: What are the recommended use cases?
A: The model excels in production environments requiring high-accuracy language understanding, including document analysis, information extraction, and text analytics. It's particularly suitable for applications needing precise entity recognition, syntactic analysis, or detailed linguistic annotation.
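For production throughput, spaCy's `nlp.pipe` streams texts through the transformer in batches rather than one document at a time. A sketch of that pattern (the `annotate` helper and the batch size are illustrative choices, not spaCy defaults):

```python
def annotate(nlp, texts, batch_size=32):
    """Stream texts through a loaded spaCy pipeline in batches and
    yield a lightweight dict of annotations per document."""
    for doc in nlp.pipe(texts, batch_size=batch_size):
        yield {
            "entities": [(ent.text, ent.label_) for ent in doc.ents],
            "tokens": [(tok.text, tok.pos_) for tok in doc],
        }

# Usage, once the model has been downloaded:
#   import spacy
#   nlp = spacy.load("en_core_web_trf")
#   results = list(annotate(nlp, ["First document.", "Second document."]))
```

Batching matters more for transformer pipelines than for the smaller CNN models, since the GPU is used most efficiently when spans from many documents are processed together.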