stanford-deidentifier-only-radiology-reports

StanfordAIMI

A transformer-based model for de-identifying radiology reports, with reported F1 scores of 97.9 and above across institutions. Built by Stanford AIMI to protect patient privacy.

Property            Value
License             MIT
Framework           PyTorch + Transformers
Base Architecture   PubMedBERT (uncased)
Primary Task        Token Classification

What is stanford-deidentifier-only-radiology-reports?

This is a model developed by Stanford AIMI for automatically de-identifying sensitive information in radiology reports. It combines a transformer-based architecture with "hide in plain sight" rule-based methods: the transformer detects protected health information (PHI), and the rules replace it with realistic substitutes so the document stays readable.
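The replacement step can be illustrated with a minimal sketch. It assumes PHI spans have already been detected and are given as character offsets. The label names, the surrogate pools, and the function `replace_phi` are hypothetical and are not taken from the Stanford AIMI code.

```python
import random
from typing import Dict, List, Tuple

# (start, end, label) with end exclusive; label names are assumptions.
Span = Tuple[int, int, str]

# Hypothetical surrogate pools, for illustration only.
SURROGATES: Dict[str, List[str]] = {
    "PATIENT": ["Jane Roe", "Alex Smith"],
    "DATE": ["03/14/2015", "11/02/2012"],
}


def replace_phi(text: str, spans: List[Span], rng: random.Random) -> str:
    """Swap each detected PHI span for a surrogate of the same type."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            raise ValueError("overlapping spans are not supported")
        pieces.append(text[cursor:start])  # keep non-PHI text unchanged
        pool = SURROGATES.get(label)
        # Fall back to a visible placeholder when no pool exists for the label.
        pieces.append(rng.choice(pool) if pool else f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


report = "Patient John Doe seen on 01/02/2020."
name_at = report.index("John Doe")
date_at = report.index("01/02/2020")
spans = [(name_at, name_at + 8, "PATIENT"), (date_at, date_at + 10, "DATE")]
print(replace_phi(report, spans, random.Random(0)))
```

Because surrogates look like real values, any PHI the detector misses is harder to tell apart from the replacements, which is the idea behind "hide in plain sight".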

Implementation Details

The model was trained on a dataset of 6,193 documents, including chest X-ray and CT reports. Reported F1 scores are 97.9 on reports from known institutions and 99.6 on reports from new institutions, with high performance on the i2b2 benchmarks as well. It uses PubMedBERT as its foundation and treats PHI detection as a token classification problem.

  • Built on PubMedBERT architecture with specialized training for medical text
  • Combines transformer learning with rule-based methods
  • Trained on multi-institutional data for robust performance
  • Implements synthetic PHI generation for enhanced training
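Token classification assigns a label to every token, so downstream code usually has to merge consecutive tokens with the same label into one PHI span. The sketch below shows one simple way to do that. The `"O"` label for non-PHI tokens, the label names, and the function `merge_token_labels` are assumptions made for illustration and may not match the model's actual tag scheme.

```python
from typing import List, Tuple

# (start, end, label) per token, with character offsets and end exclusive.
Token = Tuple[int, int, str]


def merge_token_labels(tokens: List[Token]) -> List[Token]:
    """Merge runs of adjacent tokens that share a PHI label into spans."""
    spans: List[Token] = []
    prev_index = None  # index of the last token that carried a PHI label
    for i, (start, end, label) in enumerate(tokens):
        if label == "O":  # assumed label for non-PHI tokens
            continue
        if spans and prev_index == i - 1 and spans[-1][2] == label:
            # Same label as the directly preceding token: extend the span.
            spans[-1] = (spans[-1][0], end, label)
        else:
            spans.append((start, end, label))
        prev_index = i
    return spans


# Tokens for "Patient John Doe seen 01/02/2020"
tokens = [
    (0, 7, "O"),
    (8, 12, "PATIENT"),
    (13, 16, "PATIENT"),
    (17, 21, "O"),
    (22, 32, "DATE"),
]
print(merge_token_labels(tokens))
```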

Core Capabilities

  • Accurate detection and replacement of PHI in medical documents
  • Cross-institutional compatibility
  • Superior performance compared to existing de-identification tools
  • Realistic surrogate replacement for removed PHI
  • 99.1% recall in detecting core PHI spans
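For readers less familiar with the metrics quoted here, recall is the fraction of true PHI spans that were detected, precision is the fraction of detected spans that are true PHI, and F1 is their harmonic mean. The sketch below computes all three under an exact-match criterion. That criterion is an assumption, since this page does not state how spans were matched in the reported evaluation, and the function `span_prf` and the toy spans are hypothetical.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int, str]  # (start, end, label)


def span_prf(gold: Iterable[Span], pred: Iterable[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over PHI spans."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data: four gold spans, three found correctly, one spurious prediction.
gold = [(0, 4, "PATIENT"), (5, 9, "DATE"), (10, 12, "ID"), (13, 15, "DATE")]
pred = [(0, 4, "PATIENT"), (5, 9, "DATE"), (10, 12, "ID"), (20, 25, "DATE")]
precision, recall, f1 = span_prf(gold, pred)
print(precision, recall, f1)
```

In de-identification, recall is usually the metric to watch most closely, because every missed span is PHI left in the document.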

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its hybrid approach combining transformers with rule-based methods, achieving state-of-the-art performance that exceeds both existing tools and human labelers on standard benchmarks.

Q: What are the recommended use cases?

The model is specifically designed for de-identifying radiology reports and other medical documents in clinical and research settings where maintaining patient privacy is crucial while preserving document utility.
