Gretel GLiNER Bi-Small v1.0

Property	Value
License	Apache 2.0
Language	English
Task	Token Classification
F1 Score	0.94

What is gretel-gliner-bi-small-v1.0?

Gretel GLiNER Bi-Small v1.0 is a fine-tuned model for detecting Personally Identifiable Information (PII) and Protected Health Information (PHI) in text documents. It is based on the GLiNER architecture and was fine-tuned on the gretelai/gretel-pii-masking-en-v1 dataset to identify sensitive information across a range of document types.

Implementation Details

The model performs token classification and reports 0.89 accuracy, 0.98 precision, and 0.91 recall. It identifies more than 40 types of sensitive information entities, which makes it well suited to privacy-focused applications.
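The F1 score of 0.94 in the property table follows from the precision and recall reported here, since F1 is their harmonic mean. A minimal sketch of that arithmetic (the function name `f1_score` is illustrative and not part of the model's API):

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for this model.
f1 = f1_score(0.98, 0.91)
print(round(f1, 2))  # 0.94
```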

Fine-tuned on synthetic document snippets containing PII/PHI entities
Implements advanced entity recognition capabilities
Supports confidence threshold adjustment for predictions
Optimized for privacy compliance use cases
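Confidence threshold adjustment could look roughly like the sketch below. It assumes the model is loaded through the `gliner` Python package under the Hugging Face ID `gretelai/gretel-gliner-bi-small-v1.0`, and that predictions come back as dictionaries with `start`, `end`, `text`, `label`, and `score` keys; the label names and sample values are hypothetical. The `filter_by_score` helper is illustrative and shows how a stricter cutoff can be applied after prediction.

```python
from typing import Any, Dict, List

Entity = Dict[str, Any]  # assumed keys: "start", "end", "text", "label", "score"


def predict_pii(text: str, labels: List[str], threshold: float = 0.3) -> List[Entity]:
    """Run the model on `text`; requires the `gliner` package and a model download."""
    from gliner import GLiNER  # imported lazily so the helper below works without it

    # Assumed Hugging Face model ID; adjust if the model is hosted elsewhere.
    model = GLiNER.from_pretrained("gretelai/gretel-gliner-bi-small-v1.0")
    return model.predict_entities(text, labels, threshold=threshold)


def filter_by_score(entities: List[Entity], min_score: float) -> List[Entity]:
    """Keep only predictions whose confidence is at least `min_score`."""
    return [e for e in entities if e["score"] >= min_score]


# Hand-written predictions standing in for model output (values are made up).
sample = [
    {"start": 0, "end": 8, "text": "Jane Doe", "label": "name", "score": 0.97},
    {"start": 20, "end": 24, "text": "4321", "label": "account_number", "score": 0.42},
]
strict = filter_by_score(sample, 0.9)
print([e["label"] for e in strict])  # ['name']
```

A higher threshold trades recall for precision: fewer spans are flagged, but the flagged spans are more likely to be correct.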

Core Capabilities

Detection of personal identifiers including names, addresses, and contact information
Recognition of medical record numbers and health-related information
Identification of financial data such as credit card numbers and routing information
Processing of government-issued identifiers and official documents
Support for both structured and unstructured text analysis

Frequently Asked Questions

Q: What makes this model unique?

The model focuses on detecting privacy-sensitive information, reporting high precision (0.98) alongside strong recall (0.91). It is optimized for privacy compliance scenarios and supports more than 40 entity types.

Q: What are the recommended use cases?

The model is suited to healthcare organizations working toward HIPAA compliance, financial institutions processing sensitive data, legal firms handling confidential documents, and any organization that needs PII detection and redaction as part of its GDPR compliance work.
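For the redaction use case, detected spans can be replaced with label placeholders. The sketch below is one possible approach, not the model's own tooling: it assumes each detected entity carries `start`, `end`, and `label` fields and that spans do not overlap. The `redact` and `make_span` helpers, the label names, and the sample text are all hypothetical.

```python
from typing import Any, Dict, List

Span = Dict[str, Any]  # assumed keys: "start", "end", "label"


def redact(text: str, entities: List[Span]) -> str:
    """Replace each entity span with a bracketed, upper-cased label."""
    redacted = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = "[" + ent["label"].upper() + "]"
        redacted = redacted[: ent["start"]] + placeholder + redacted[ent["end"]:]
    return redacted


def make_span(text: str, value: str, label: str) -> Span:
    """Build a span for the first occurrence of `value` (stand-in for model output)."""
    start = text.index(value)
    return {"start": start, "end": start + len(value), "label": label}


note = "Patient Jane Doe, MRN 12345."
detected = [
    make_span(note, "Jane Doe", "name"),
    make_span(note, "12345", "medical_record_number"),
]
print(redact(note, detected))  # Patient [NAME], MRN [MEDICAL_RECORD_NUMBER].
```

Overlapping predictions would need to be merged or de-duplicated before calling `redact`; that step is left out here.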