deberta-v3-base_finetuned_ai4privacy_v2

Maintained By
Isotonic

Property          Value
Parameter Count   184M
License           CC-BY-NC-4.0
Base Model        microsoft/deberta-v3-base
Training Dataset  ai4privacy/pii-masking-200k
Overall F1 Score  97.57%

What is deberta-v3-base_finetuned_ai4privacy_v2?

This is a token-classification model for privacy protection, fine-tuned on ai4privacy/pii-masking-200k, a dataset billed as the world's largest open-source privacy dataset. Built on Microsoft's DeBERTa-v3-base architecture, it identifies and masks personally identifiable information (PII) across 54 classes of sensitive data.
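As a rough illustration of how such a model is typically called, the sketch below wraps it in a Hugging Face `transformers` token-classification pipeline. The Hub id `Isotonic/deberta-v3-base_finetuned_ai4privacy_v2` is an assumption pieced together from the maintainer and model name, and the helper names `load_pii_detector` and `detect_pii` and the `min_score` threshold are hypothetical, not part of the model card.

```python
# Assumed Hugging Face Hub id (maintainer + model name); verify before use.
MODEL_ID = "Isotonic/deberta-v3-base_finetuned_ai4privacy_v2"


def load_pii_detector():
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of this module works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word tokens into whole entity spans.
    return pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",
    )


def detect_pii(detector, text, min_score=0.5):
    """Return the entity spans whose confidence is at least min_score.

    `detector` is any callable that maps a string to a list of dicts with
    "entity_group", "score", "start", and "end" keys, which is the shape an
    aggregated token-classification pipeline returns.
    """
    return [entity for entity in detector(text) if entity["score"] >= min_score]
```

Calling `detect_pii(load_pii_detector(), "some text")` would fetch the model weights on first use. The 0.5 threshold is an arbitrary starting point and would need tuning against real data.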

Implementation Details

The model was trained with the Adam optimizer (betas=(0.96, 0.996)) and a cosine learning rate scheduler with warmup. Training ran for 7 epochs with a batch size of 32.
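To make the schedule concrete, here is a minimal, dependency-free sketch of a cosine learning rate schedule with linear warmup. Only the betas, epoch count, and batch size come from the description above. The base learning rate, step counts, and the function name `cosine_with_warmup` are illustrative assumptions, and the original training run may have used a different variant.

```python
import math

# Values stated in the model description.
TRAINING_CONFIG = {
    "optimizer": "Adam",
    "betas": (0.96, 0.996),
    "epochs": 7,
    "batch_size": 32,
}


def cosine_with_warmup(step, total_steps, warmup_steps, base_lr):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical numbers for illustration only: 1,000 steps, 100 of them warmup.
schedule = [cosine_with_warmup(s, 1000, 100, 1e-4) for s in range(1001)]
```

The rate peaks at `base_lr` when warmup ends and falls smoothly to zero by the final step.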

  • Comprehensive coverage of 54 PII classes
  • Trained on diverse interaction styles including casual conversation, formal documents, and emails
  • Optimized for business, education, psychology, and legal fields
  • Achieves 99.15% overall accuracy

Core Capabilities

  • Reported 100% detection scores for emails, cryptocurrency addresses, URLs, and usernames
  • Exceptional performance in identifying personal information (names, addresses, phone numbers)
  • Strong financial data detection (credit card numbers, account details)
  • Robust handling of various date formats and location information

Frequently Asked Questions

Q: What makes this model unique?

The model covers 54 PII classes, more than a detector limited to a few entity types such as names and emails. It reports perfect F1 scores in several categories while maintaining high performance across the rest.

Q: What are the recommended use cases?

The model is suited to privacy-sensitive applications such as document anonymization, privacy filtering for AI assistants, and automated PII removal in text processing pipelines. It fits business, legal, and educational contexts in particular, where data privacy is crucial.
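For the anonymization use case, the masking step can be sketched independently of the model. The example below assumes entity spans in the shape an aggregated `transformers` token-classification pipeline returns (`entity_group`, `start`, `end`). The function name `mask_pii`, the bracketed placeholder format, and the label names in the sample are hypothetical.

```python
def mask_pii(text, entities):
    """Replace each detected span in `text` with a [LABEL] placeholder.

    `entities` is a list of dicts with "entity_group", "start", and "end" keys.
    Spans are assumed not to overlap.
    """
    masked = text
    # Work from the end of the string backwards so earlier offsets stay valid.
    for entity in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{entity['entity_group']}]"
        masked = masked[: entity["start"]] + placeholder + masked[entity["end"] :]
    return masked


# Hand-written spans standing in for real model output.
sample_text = "Call Bob at 555-1234"
sample_entities = [
    {"entity_group": "FIRSTNAME", "start": 5, "end": 8},
    {"entity_group": "PHONENUMBER", "start": 12, "end": 20},
]
masked_sample = mask_pii(sample_text, sample_entities)
```

Replacing spans from right to left avoids recomputing offsets after each substitution. A production pipeline would also need to decide how to handle overlapping or low-confidence spans.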
