en_core_med7_lg

Property	Value
Author	Andrey Kormilitzin
License	MIT
spaCy Version	>=3.4.2,<3.5.0
Vector Dimensions	300
Accuracy (F-Score)	87.70%

What is en_core_med7_lg?

en_core_med7_lg is a spaCy model for named entity recognition (NER) in clinical text, developed by Andrey Kormilitzin. It ships with 300-dimensional word vectors covering 514,157 unique keys and reports an NER F-score of 87.70%.

Implementation Details

The pipeline has two components, tok2vec and NER. It uses 300-dimensional word vectors and targets medical text, with a specific focus on medication-related entities.

Pre-trained word vectors: 514,157 unique vectors
NER precision of 86.50% and recall of 88.93%
Specialized for medical domain terminology
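
The precision and recall figures above can be checked against the reported F-score. The sketch below assumes the F-score is the usual harmonic mean of precision and recall (F1); the function name `f_score` is illustrative and not part of the model or of spaCy.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1).

    Assumption: the model card's F-score is a plain F1 over the
    reported precision and recall, both given in percent.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures taken from the model card: 86.50% precision, 88.93% recall.
f1 = f_score(86.50, 88.93)
print(f"{f1:.2f}")  # prints 87.70
```

Under that assumption the three published numbers are consistent with one another.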

Core Capabilities

Recognition of 7 medical entities: DOSAGE, DRUG, DURATION, FORM, FREQUENCY, ROUTE, STRENGTH
Token vectorization through the tok2vec component and 300-dimensional word vectors
NER F-score of 87.70% (86.50% precision, 88.93% recall)
Compatible with spaCy >=3.4.2,<3.5.0
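
The property table gives the supported spaCy range as >=3.4.2,<3.5.0. A minimal sketch of checking an installed version against that range follows. The helper names `parse_version` and `is_supported` are hypothetical, and the parsing is deliberately simplified: it ignores pre-release suffixes, which a real resolver such as pip treats differently.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn a string such as '3.4.4' into (3, 4, 4).

    Simplification: anything after the numeric part (e.g. 'rc1')
    is ignored, and a missing patch number counts as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_supported(version: str) -> bool:
    """True if the version falls inside >=3.4.2,<3.5.0."""
    return (3, 4, 2) <= parse_version(version) < (3, 5, 0)


print(is_supported("3.4.4"))  # prints True
print(is_supported("3.5.0"))  # prints False
```

With spaCy installed, the same check can be run as `is_supported(spacy.__version__)` before loading the model.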

Frequently Asked Questions

Q: What makes this model unique?

The model is tailored to medication-related information extraction from medical text. Its seven medication-specific entity labels and reported F-score of 87.70% make it well suited to healthcare applications.

Q: What are the recommended use cases?

The model is suited to processing clinical notes, medical records, and pharmaceutical documentation, where it extracts medication details such as drug names, dosages, and administration instructions.
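
As a rough illustration of that extraction step, the sketch below groups recognized entities by the seven labels listed under Core Capabilities. It assumes the model package is installed and loadable with `spacy.load("en_core_med7_lg")`. The helpers `group_entities` and `load_med7` are illustrative names, and the sample document is a hand-built stand-in for a spaCy `Doc`, not actual model output.

```python
from collections import defaultdict
from types import SimpleNamespace

# The seven entity labels listed in the model card.
MED7_LABELS = ("DOSAGE", "DRUG", "DURATION", "FORM", "FREQUENCY", "ROUTE", "STRENGTH")


def group_entities(doc) -> dict:
    """Group entity texts by label.

    Works with any object exposing spaCy-style `.ents`, where each
    entity has `.text` and `.label_` attributes.
    """
    grouped = defaultdict(list)
    for ent in doc.ents:
        if ent.label_ in MED7_LABELS:
            grouped[ent.label_].append(ent.text)
    return dict(grouped)


def load_med7():
    """Load the model; assumes spaCy and en_core_med7_lg are installed."""
    import spacy  # imported lazily so the rest of the sketch runs without spaCy

    return spacy.load("en_core_med7_lg")


# Hand-made stand-in for a processed document (NOT real model output),
# used only to show the shape of the result.
fake_doc = SimpleNamespace(
    ents=[
        SimpleNamespace(text="aspirin", label_="DRUG"),
        SimpleNamespace(text="75 mg", label_="STRENGTH"),
        SimpleNamespace(text="once daily", label_="FREQUENCY"),
    ]
)
print(group_entities(fake_doc))
```

With the model installed, the equivalent call on real text would be `group_entities(load_med7()(text))`.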
