clinicalnerpt-medical

Maintained by: pucpr

Property | Value
Author | PUCPR
Paper | BioBERTpt Paper
Training Data | SemClinBr Clinical Corpus
Training Parameters | 10 epochs, IOB2 format

What is clinicalnerpt-medical?

clinicalnerpt-medical is a specialized Named Entity Recognition (NER) model designed for identifying medical entities in Portuguese clinical texts. It's part of the larger BioBERTpt project, which comprises 13 different models trained to recognize clinical entities that are compatible with the Unified Medical Language System (UMLS). The model was developed by researchers at PUCPR using the Brazilian clinical corpus SemClinBr as training data.

Implementation Details

The model is built on BioBERTpt, a BERT-based model adapted from multilingual BERT through transfer learning. It was trained for 10 epochs using the IOB2 (Inside, Outside, Beginning) tagging format, a standard scheme for NER tasks in which the first token of every entity is tagged B-, continuation tokens are tagged I-, and all other tokens are tagged O. The model benefits from domain-specific pre-training on biomedical and clinical texts in Portuguese.

  • Built on BioBERTpt, itself derived from multilingual BERT
  • Trained specifically on Brazilian Portuguese clinical texts
  • Implements IOB2 tagging format
  • Optimized for medical entity recognition
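
To make the IOB2 format concrete, here is a minimal Python sketch that collapses a tagged token sequence into entity spans. The sentence and the label names (`Finding`, `Drug`) are hypothetical illustrations and are not taken from the model's actual label set; the function `iob2_to_spans` is likewise an illustrative helper, not part of any library.

```python
from typing import List, Optional, Tuple


def iob2_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse IOB2 tags into (entity_text, label) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Close the entity that is currently open, if any.
        nonlocal current_tokens, current_label
        if current_tokens and current_label is not None:
            spans.append((" ".join(current_tokens), current_label))
        current_tokens, current_label = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # B- always opens a new entity, closing any open one first.
            flush()
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # I- continues the open entity only if the label matches.
            current_tokens.append(token)
        else:
            # "O", or an I- tag with no matching open entity (dropped here).
            flush()
    flush()
    return spans


# Hypothetical example sentence and labels, for illustration only.
tokens = ["Paciente", "com", "dor", "torácica", "em", "uso", "de", "aspirina", "."]
tags = ["O", "O", "B-Finding", "I-Finding", "O", "O", "O", "B-Drug", "O"]
print(iob2_to_spans(tokens, tags))
# [('dor torácica', 'Finding'), ('aspirina', 'Drug')]
```

This sketch treats an I- tag that does not follow a matching B- tag as invalid and discards it; a more lenient decoder could instead start a new entity at that point.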

Core Capabilities

  • Recognition of clinical entities compatible with UMLS
  • Specialized processing of Portuguese medical terminology
  • Enhanced performance through domain-specific training
  • Reduced need for labeled data through transfer learning

Frequently Asked Questions

Q: What makes this model unique?

This model addresses the shortage of Portuguese clinical NLP tools by offering medical entity recognition trained on Brazilian clinical texts. It belongs to the BioBERTpt family of clinical NER models and has demonstrated improved performance compared to baseline models.

Q: What are the recommended use cases?

The model is ideal for extracting medical entities from Portuguese clinical texts, electronic health records, and medical documentation. It's particularly useful for healthcare institutions and researchers working with Portuguese medical texts who need to automatically identify and classify clinical entities.
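
As a rough sketch of such an extraction workflow, the Python snippet below wraps the Hugging Face `transformers` token-classification pipeline. It assumes the model is published on the Hugging Face Hub as `pucpr/clinicalnerpt-medical` (inferred from the maintainer and model name above, not confirmed here). The helper `extract_entities` and the `min_score` threshold are hypothetical choices made for illustration.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Assumed Hub identifier, inferred from the maintainer and model name.
MODEL_ID = "pucpr/clinicalnerpt-medical"


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of this module works without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entities.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def extract_entities(
    text: str,
    ner: Optional[Callable[[str], List[Dict[str, Any]]]] = None,
    min_score: float = 0.5,  # hypothetical confidence cutoff
) -> List[Tuple[str, str]]:
    """Return (entity_text, label) pairs whose score is at least min_score."""
    if ner is None:
        ner = load_ner_pipeline()
    return [
        (entity["word"], entity["entity_group"])
        for entity in ner(text)
        if entity["score"] >= min_score
    ]
```

Calling `extract_entities("Paciente com dor torácica em uso de aspirina.")` would load the model and return the recognized entities. The labels returned depend on the model's own configuration, so none are shown here. Passing a preloaded pipeline as `ner` avoids reloading the model for every document.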
