Gretel GLiNER Base Model
Property | Value |
---|---|
License | Apache 2.0 |
Language | English |
Task | Token Classification |
Performance | 0.91 Accuracy, 0.95 F1 Score |
What is gretel-gliner-bi-base-v1.0?
Gretel GLiNER is a fine-tuned model for detecting Personally Identifiable Information (PII) and Protected Health Information (PHI) in text documents. Built on the GLiNER base model architecture, it was fine-tuned on the gretelai/gretel-pii-masking-en-v1 dataset for privacy-sensitive entity detection.
Implementation Details
The model performs token classification and was fine-tuned on a synthetic dataset. It reports 0.98 precision and 0.92 recall, which is consistent with the 0.95 F1 score listed above: on the evaluation data, roughly 2% of flagged spans are false positives and roughly 8% of true entities are missed.
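As a quick sanity check, the F1 score can be recomputed from the precision and recall quoted in this section. This is a minimal sketch that assumes the standard harmonic-mean definition of F1; the variable names are illustrative.

```python
# Precision and recall as quoted in this section.
precision = 0.98
recall = 0.92

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 2))  # 0.95
```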
- Supports 42+ entity types including medical records, personal identifiers, and financial information
- Implements confidence threshold-based entity prediction
- Python-based implementation using the GLiNER library
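A minimal sketch of how threshold-based prediction might look with the GLiNER library. It makes several assumptions: the `gliner` package is installed, the model is published on the Hugging Face Hub as `gretelai/gretel-gliner-bi-base-v1.0`, and the label names are purely illustrative (check the model card for the supported entity types). The helper names `load_model`, `detect_entities`, and `demo` are hypothetical.

```python
# Illustrative label names (an assumption, not the model's official list).
LABELS = ["name", "email", "phone_number", "ssn", "medical_record_number"]


def load_model(model_id="gretelai/gretel-gliner-bi-base-v1.0"):
    """Load the model through the GLiNER library (assumed Hub ID)."""
    # Imported lazily so detect_entities() can be used without the package.
    from gliner import GLiNER
    return GLiNER.from_pretrained(model_id)


def detect_entities(model, text, labels=LABELS, threshold=0.3):
    """Return (label, text, score) tuples for the entities the model predicts."""
    # predict_entities returns dicts with "label", "text", "score", "start", "end".
    entities = model.predict_entities(text, labels, threshold=threshold)
    return [(e["label"], e["text"], round(e["score"], 2)) for e in entities]


def demo():
    """Download the model and print predictions for one sample sentence."""
    model = load_model()
    sample = "Patient John Doe (MRN 12345) can be reached at john.doe@example.com."
    for label, span, score in detect_entities(model, sample):
        print(label, span, score)
```

Calling `demo()` downloads the model weights on first use. Raising `threshold` trades recall for precision, so it is worth tuning against a labeled sample of your own documents.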
Core Capabilities
- Accurate detection of PII/PHI entities across various document types
- Support for healthcare, financial, and legal document processing
- Privacy-compliant information extraction and redaction
- Real-time entity recognition with configurable confidence thresholds
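The redaction and configurable-threshold capabilities listed above can be sketched in plain Python. This example assumes entities arrive as dictionaries with `start`, `end`, `label`, and `score` keys, the shape GLiNER's `predict_entities` returns. The `redact` helper and the sample entities are hypothetical, and overlapping spans are not handled.

```python
def redact(text, entities, threshold=0.5):
    """Replace each entity span scoring at least `threshold` with a [LABEL] tag."""
    kept = [e for e in entities if e["score"] >= threshold]
    # Replace right to left so earlier character offsets stay valid.
    for e in sorted(kept, key=lambda e: e["start"], reverse=True):
        text = text[:e["start"]] + "[" + e["label"].upper() + "]" + text[e["end"]:]
    return text


sample = "Contact Jane Roe at jane@example.com."
# Hand-written entities for illustration (not real model output).
entities = [
    {"start": 8, "end": 16, "label": "name", "score": 0.97},
    {"start": 20, "end": 36, "label": "email", "score": 0.42},
]

print(redact(sample, entities, threshold=0.5))  # Contact [NAME] at jane@example.com.
print(redact(sample, entities, threshold=0.3))  # Contact [NAME] at [EMAIL].
```

With the higher threshold the low-confidence email span is left in place, which shows why the threshold should be chosen with the cost of a missed entity in mind.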
Frequently Asked Questions
Q: What makes this model unique?
This model focuses specifically on detecting privacy-sensitive information, reporting 0.91 accuracy and a 0.95 F1 score on that task. It is designed for privacy compliance use cases across multiple industries.
Q: What are the recommended use cases?
Recommended use cases include healthcare record processing, financial document analysis, cybersecurity log scanning, legal document processing, and GDPR/HIPAA compliance verification. It is particularly useful for organizations that need to identify and protect sensitive information automatically.