# PetBERT
| Property | Value |
|---|---|
| Parameter Count | 108M |
| Model Type | Masked Language Model |
| License | OpenRAIL |
| Paper | Scientific Reports |
| Training Data | 500M+ words from veterinary records |
## What is PetBERT?
PetBERT is a language model purpose-built for veterinary medicine. Starting from the BERT architecture, it was further pre-trained on over 500 million words written by UK veterinary clinicians. Developed by SAVSNET (the Small Animal Veterinary Surveillance Network at the University of Liverpool), it represents a significant step forward in automated veterinary health-record processing and disease surveillance.
## Implementation Details
The model was trained using an adaptation of the ULMFiT framework, building on the BERT-base architecture. Training used both Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives, conducted over 450 hours on an NVIDIA A100 GPU. During MLM, 15% of the tokens in each clinical narrative are masked, and the model learns to predict the appropriate veterinary terminology from the surrounding context.
- Fine-tuned from bert-base-uncased
- Trained on 5.1 million electronic health records
- Achieves F1 scores exceeding 83% for disease coding
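As a brief illustration of the masked-language-modeling objective described above, the sketch below queries the model through the Hugging Face `fill-mask` pipeline. The repository ID `SAVSNET/PetBERT` is an assumption for this example; check the actual Hub listing before use.

```python
# Minimal fill-mask sketch; the model ID "SAVSNET/PetBERT" is assumed.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="SAVSNET/PetBERT")

# A veterinary clinical narrative with one token masked out; the model
# ranks plausible veterinary terms for the blank.
text = "The dog presented with acute [MASK] and lethargy."
for prediction in fill_mask(text, top_k=5):
    print(f"{prediction['token_str']:>12}  score={prediction['score']:.3f}")
```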
## Core Capabilities
- Automated ICD-11 syndromic disease coding
- Early disease outbreak detection
- Clinical narrative understanding
- Veterinary terminology prediction
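For the disease-coding capability, a fine-tuned classification head sits on top of PetBERT. The sketch below shows one plausible way to run such a classifier; the checkpoint ID `SAVSNET/PetBERT-ICD`, the multi-label treatment, and the 0.5 decision threshold are all assumptions for illustration, not confirmed details of the released model.

```python
# Hedged sketch of automated ICD-11 syndromic coding with a PetBERT-based
# classifier. The repository ID and label handling are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "SAVSNET/PetBERT-ICD"  # hypothetical fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

narrative = (
    "3yo MN DSH, vomiting and diarrhoea for 48h, inappetent, "
    "mild dehydration on exam, no foreign body palpable."
)
inputs = tokenizer(narrative, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits

# Treat coding as multi-label: apply a sigmoid and keep labels above a threshold.
probs = torch.sigmoid(logits).squeeze(0)
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)
```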
## Frequently Asked Questions
Q: What makes this model unique?
PetBERT is the first large-scale language model trained specifically on veterinary clinical records. This gives it a detailed grasp of animal-health documentation and supports earlier detection of disease signals than general-purpose models.
Q: What are the recommended use cases?
The model is ideal for automated coding of veterinary records, disease surveillance, and clinical decision support in veterinary practices. It can process clinical narratives and identify potential disease outbreaks up to 3 weeks earlier than traditional methods.
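To make the surveillance use case concrete, the toy example below shows one simple way (not SAVSNET's published method) that coded records could feed outbreak detection: aggregate weekly case counts for a syndrome and flag weeks well above the historical baseline. The counts and threshold here are invented for illustration.

```python
# Toy surveillance signal on top of coded records; data and threshold are invented.
from statistics import mean, stdev

# Weekly counts of one coded syndrome (e.g. gastrointestinal disease),
# as produced by running the coder over incoming consultations.
weekly_counts = [14, 12, 15, 13, 16, 14, 13, 29]

baseline, current = weekly_counts[:-1], weekly_counts[-1]
threshold = mean(baseline) + 3 * stdev(baseline)
if current > threshold:
    print(f"Possible outbreak signal: {current} cases vs threshold {threshold:.1f}")
```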