Gretel GLiNER Bi-Small v1.0
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Language | English |
| Task | Token Classification |
| F1 Score | 0.94 |
What is gretel-gliner-bi-small-v1.0?
Gretel GLiNER Bi-Small v1.0 is a fine-tuned model for detecting Personally Identifiable Information (PII) and Protected Health Information (PHI) in text documents. Built on the GLiNER architecture, it was trained on the gretelai/gretel-pii-masking-en-v1 dataset to identify sensitive information across a range of document types.
Implementation Details
The model performs token classification and reports 0.89 accuracy, 0.98 precision, and 0.91 recall. It identifies more than 40 types of sensitive-information entities, which makes it useful for privacy-focused applications.
- Fine-tuned on synthetic document snippets containing PII/PHI entities
- Performs named entity recognition for PII/PHI entity types
- Supports confidence threshold adjustment for predictions
- Optimized for privacy compliance use cases
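The confidence threshold mentioned in the list above can be illustrated with a short sketch. It assumes the `gliner` Python package with its `GLiNER.from_pretrained` and `predict_entities` calls, the Hugging Face model ID `gretelai/gretel-gliner-bi-small-v1.0`, and label names such as `email`. None of these are stated in this card, so check them against the upstream model card before relying on them. The `keep_confident` helper is hypothetical and only shows what raising the threshold does to a list of scored predictions.

```python
from typing import Dict, List

# Assumed model ID and label names; verify against the upstream model card.
MODEL_ID = "gretelai/gretel-gliner-bi-small-v1.0"
LABELS = ["first_name", "last_name", "email", "phone_number"]


def detect(text: str, threshold: float = 0.5) -> List[Dict]:
    """Run the model on `text`; needs the `gliner` package and a model download."""
    from gliner import GLiNER  # imported lazily so the helper below works without it

    model = GLiNER.from_pretrained(MODEL_ID)
    return model.predict_entities(text, LABELS, threshold=threshold)


def keep_confident(entities: List[Dict], threshold: float) -> List[Dict]:
    """Hypothetical helper: keep predictions whose score meets the threshold."""
    return [e for e in entities if e["score"] >= threshold]


# Hand-written predictions in the shape assumed here (label, text, score).
sample = [
    {"label": "email", "text": "jane@example.com", "score": 0.97},
    {"label": "first_name", "text": "May", "score": 0.41},
]
confident = keep_confident(sample, threshold=0.7)
print(confident)
```

A higher threshold drops low-scoring guesses (here the ambiguous "May"), trading some recall for precision. A lower one does the opposite.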
Core Capabilities
- Detection of personal identifiers including names, addresses, and contact information
- Recognition of medical record numbers and health-related information
- Identification of financial data such as credit card numbers and routing information
- Processing of government-issued identifiers and official documents
- Support for both structured and unstructured text analysis
Frequently Asked Questions
Q: What makes this model unique?
The model focuses specifically on detecting privacy-sensitive information, reaching 0.98 precision with 0.91 recall. It is tuned for privacy compliance scenarios and supports more than 40 entity types.
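As a quick consistency check, the precision and recall quoted here combine into the F1 score listed in the properties table. The sketch below uses the standard harmonic-mean definition of F1. The function name `f1_score` is illustrative, not part of any model tooling.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.98, 0.91)
print(round(f1, 2))  # 0.94
```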
Q: What are the recommended use cases?
The model suits healthcare organizations working toward HIPAA compliance, financial institutions processing sensitive data, legal firms handling confidential documents, and any organization that relies on PII detection and redaction as part of GDPR compliance.
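The redaction step mentioned in this answer can be sketched as a small post-processing function. This is a minimal illustration, not the model's own tooling. It assumes each detected entity is a dictionary with `start`, `end`, and `label` keys holding character offsets, and that spans do not overlap. The `redact` helper and the sample entities are hypothetical.

```python
from typing import Dict, List


def redact(text: str, entities: List[Dict]) -> str:
    """Replace each detected span with a [LABEL] placeholder."""
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['label'].upper()}]"
        text = text[: ent["start"]] + placeholder + text[ent["end"] :]
    return text


# Hand-written spans standing in for model output (assumed shape).
text = "Contact Jane Doe at jane@example.com."
entities = [
    {"start": 8, "end": 16, "label": "name"},
    {"start": 20, "end": 36, "label": "email"},
]
redacted = redact(text, entities)
print(redacted)  # Contact [NAME] at [EMAIL].
```

Replacing spans with their label rather than deleting them keeps the redacted text readable and records what kind of information was removed.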