pl_core_news_lg

Property	Value
License	GPL 3.0
Vector Dimensions	300
Vocabulary Size	500,000 keys
spaCy Version	>=3.7.0,<3.8.0
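
The spaCy Version row is a version range, so it can help to check an installed version against it before loading the model. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `satisfies` are hypothetical, and the parser assumes plain numeric versions such as 3.7.4 (no pre-release suffixes).

```python
import operator

# Longer operators come first so ">=" is not mistaken for ">".
_OPS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text):
    """Turn '3.7.4' into (3, 7, 4), padding short versions with zeros."""
    numbers = [int(part) for part in text.strip().split(".")[:3]]
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def satisfies(version, spec=">=3.7.0,<3.8.0"):
    """Return True if `version` meets every comma-separated clause in `spec`."""
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in _OPS:
            if clause.startswith(symbol):
                if not compare(current, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(satisfies("3.7.4"))
print(satisfies("3.8.0"))
```

With spaCy installed, the installed version string is available as `spacy.__version__` and can be passed to `satisfies` directly.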

What is pl_core_news_lg?

pl_core_news_lg is a Polish language model for the spaCy framework, optimized for CPU usage. It bundles tagging, morphological analysis, lemmatization, dependency parsing, sentence segmentation, and named entity recognition for Polish text in a single pipeline.

Implementation Details

The pipeline consists of tok2vec, morphologizer, parser, lemmatizer, tagger, senter, and named entity recognition components. It includes 500,000 unique word vectors with 300 dimensions. Its sources are the National Corpus of Polish, UD Polish PDB, and Explosion fastText vectors.
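
When only part of the pipeline is needed, the unused components can be left out at load time. The sketch below assumes the components are registered under the names listed above, with `ner` assumed as the name of the named entity recognition component. The helpers `components_to_exclude` and `load_subset` are hypothetical, and keeping `tok2vec` by default is a conservative assumption, since the source does not say which components depend on each other.

```python
# Component names as listed in the paragraph above ("ner" is an assumed name).
PIPELINE = ["tok2vec", "morphologizer", "parser", "lemmatizer",
            "tagger", "senter", "ner"]


def components_to_exclude(wanted, always_keep=("tok2vec",)):
    """Return the pipeline components that are not needed, in pipeline order."""
    keep = set(wanted) | set(always_keep)
    unknown = keep - set(PIPELINE)
    if unknown:
        raise ValueError(f"unknown component(s): {sorted(unknown)}")
    return [name for name in PIPELINE if name not in keep]


def load_subset(wanted, name="pl_core_news_lg"):
    """Load the model without the unneeded components (requires spaCy and the model)."""
    import spacy  # imported lazily so the helper above works without spaCy installed
    return spacy.load(name, exclude=components_to_exclude(wanted))


print(components_to_exclude(["ner"]))
```

Calling `load_subset(["ner"])` would then load a pipeline intended for entity extraction only. Before relying on a reduced pipeline, it is worth checking that the remaining components still produce the expected annotations.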

Named Entity Recognition: 84.74% precision, 83.56% recall
POS Tagging: 98.29% accuracy
Morphological Analysis: 90.98% accuracy
Dependency Parsing: 89.50% UAS, 82.38% LAS
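
The NER figures are given as precision and recall only. As a worked example, the harmonic mean of the two gives a single F-score; the value computed here is derived from the numbers above and is not an officially reported score.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); inputs in the same unit."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall for NER as listed above, in percent.
ner_f1 = f_score(84.74, 83.56)
print(f"NER F1: {ner_f1:.2f}%")
```

For these inputs the result is roughly 84.15%, slightly below the arithmetic mean because the harmonic mean penalizes the gap between precision and recall.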

Core Capabilities

Morphological analysis covering Polish inflection (90.98% accuracy)
Named entity recognition for dates, geographic names, organizations, and person names
Lemmatization (94.25%) and sentence segmentation (96.31% F-score)
Dependency parsing with support for 63 dependency relations
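
The capabilities above map onto standard spaCy token, entity, and sentence attributes. The following sketch shows one way to collect them, assuming spaCy and the pl_core_news_lg package are installed; the helper names are hypothetical, and the import is deferred so that the extraction helpers can be tried on any object with the same attributes.

```python
def load_pipeline(name="pl_core_news_lg"):
    """Load the installed pipeline; spaCy is imported lazily on purpose."""
    import spacy
    return spacy.load(name)


def token_rows(doc):
    """One (text, lemma, POS, morphology, dependency label) tuple per token."""
    return [(t.text, t.lemma_, t.pos_, str(t.morph), t.dep_) for t in doc]


def entity_rows(doc):
    """One (text, label) tuple per named entity found in the document."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def sentence_texts(doc):
    """The text of each sentence produced by sentence segmentation."""
    return [sent.text for sent in doc.sents]
```

A typical call would be `doc = load_pipeline()("Maria mieszka w Warszawie od 2019 roku.")`, followed by `token_rows(doc)`, `entity_rows(doc)`, and `sentence_texts(doc)`. The exact labels and analyses returned depend on the installed model version.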

Frequently Asked Questions

Q: What makes this model unique?

The model covers several Polish NLP tasks in a single pipeline and pairs a 500,000-key vocabulary of 300-dimensional word vectors with morphological analysis aimed at the complexities of Polish grammar.

Q: What are the recommended use cases?

The model suits Polish text analysis tasks such as detailed linguistic analysis, information extraction, text classification, and natural language understanding applications that need grammatical and semantic processing of Polish text.
