clinicalnerpt-medical
| Property | Value |
|---|---|
| Author | PUCPR |
| Paper | BioBERTpt Paper |
| Training Data | SemClinBr Clinical Corpus |
| Training Parameters | 10 epochs, IOB2 format |
What is clinicalnerpt-medical?
clinicalnerpt-medical is a specialized Named Entity Recognition (NER) model designed for identifying medical entities in Portuguese clinical texts. It's part of the larger BioBERTpt project, which comprises 13 different models trained to recognize clinical entities that are compatible with the Unified Medical Language System (UMLS). The model was developed by researchers at PUCPR using the Brazilian clinical corpus SemClinBr as training data.
Implementation Details
The model is fine-tuned from BioBERTpt, which is itself an adaptation of the multilingual BERT model obtained through transfer learning: further pre-training on biomedical and clinical texts in Portuguese. Fine-tuning ran for 10 epochs on data labeled in the IOB2 (Inside, Outside, Beginning) tagging format, a standard scheme for NER tasks in which a B- tag marks the first token of every entity, an I- tag marks each following token of the same entity, and O marks tokens outside any entity.
- Built on BioBERTpt architecture
- Trained specifically on Brazilian Portuguese clinical texts
- Implements IOB2 tagging format
- Optimized for medical entity recognition
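To make the IOB2 format concrete, here is a minimal sketch of how a sequence of IOB2 tags can be decoded back into entity spans. The function name `iob2_to_spans` and the label `Problem` are hypothetical and chosen only for illustration; the actual label set of the model is not listed here.

```python
from typing import List, Optional, Tuple


def iob2_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group IOB2-tagged tokens into (label, text) entity spans."""
    spans: List[Tuple[str, str]] = []
    current_label: Optional[str] = None
    current_tokens: List[str] = []

    def close_span() -> None:
        # Store the entity that is currently open, if any.
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # B- always starts a new entity, closing any open one.
            close_span()
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # I- continues the open entity of the same label.
            current_tokens.append(token)
        else:
            # "O", or an I- tag with no matching open entity
            # (invalid in strict IOB2, so it is dropped here).
            close_span()
            current_label = None
            current_tokens = []

    close_span()
    return spans


# Hypothetical example: "Problem" is an illustrative label only.
example_tokens = ["Paciente", "com", "dor", "torácica", "e", "febre"]
example_tags = ["O", "O", "B-Problem", "I-Problem", "O", "B-Problem"]
example_spans = iob2_to_spans(example_tokens, example_tags)
```

In this example, `example_spans` contains two entities, "dor torácica" and "febre", both with the illustrative label `Problem`.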
Core Capabilities
- Recognition of clinical entities compatible with UMLS
- Specialized processing of Portuguese medical terminology
- Enhanced performance through domain-specific training
- Reduced need for labeled data through transfer learning
Frequently Asked Questions
Q: What makes this model unique?
This model addresses the gap in Portuguese clinical NLP tools by offering medical entity recognition trained on Brazilian clinical texts. It's part of a larger set of clinical NER models from the BioBERTpt project, whose paper reports improved performance over baseline models.
Q: What are the recommended use cases?
The model is ideal for extracting medical entities from Portuguese clinical texts, electronic health records, and medical documentation. It's particularly useful for healthcare institutions and researchers working with Portuguese medical texts who need to automatically identify and classify clinical entities.
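As a rough illustration of the extraction use case, the sketch below loads the model through the Hugging Face `transformers` token-classification pipeline and keeps only confident predictions. It rests on several assumptions: that the model is published on the Hub under the identifier `pucpr/clinicalnerpt-medical`, that `transformers` and a backend such as PyTorch are installed, and that a threshold of 0.5 is a sensible starting point. The helper names `load_ner` and `keep_confident` are hypothetical.

```python
from typing import Any, Dict, List

# Assumed Hub identifier; adjust if the model lives elsewhere.
MODEL_ID = "pucpr/clinicalnerpt-medical"


def load_ner(model_id: str = MODEL_ID):
    """Build a token-classification pipeline for the model."""
    # Imported lazily so keep_confident() works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entities.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def keep_confident(
    entities: List[Dict[str, Any]], min_score: float = 0.5
) -> List[Dict[str, Any]]:
    """Filter aggregated pipeline output by score and simplify each record."""
    return [
        {
            "text": entity["word"],
            "label": entity["entity_group"],
            "start": entity["start"],
            "end": entity["end"],
        }
        for entity in entities
        if float(entity["score"]) >= min_score
    ]
```

A typical call would be `keep_confident(load_ner()("Paciente com dor torácica e febre."))`. The first call downloads the model weights, and the labels returned depend on the model's own configuration.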