phikon-v2

Maintained by: owkin

Property       Value
Parameters     303M
Model Type     Vision Transformer (ViT-L/16)
License        Owkin non-commercial license
Paper          arXiv
Training Data  450M histology images (PANCAN-XL)

What is phikon-v2?

Phikon-v2 is a Vision Transformer (ViT-Large) model for histology image analysis. It is presented as an improvement over its predecessor, Phikon, and was pretrained with the DINOv2 self-supervised learning method on PANCAN-XL, a dataset of 450M histology images at 20x magnification. The model is intended for feature extraction from histology images, particularly for biomarker discovery and pathology analysis.

Implementation Details

The model uses a ViT-Large architecture with 303M parameters, a patch size of 16, and an embedding dimension of 1024. It was trained with PyTorch FSDP in mixed precision on a cluster of 32x4 Nvidia V100 GPUs, taking approximately 4,300 GPU hours. Key elements of the training setup:

  • Self-distillation learning with DINOv2
  • Masked-image modeling with iBOT
  • KoLeo regularization on CLS tokens
  • Training batch size of 4,096
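
The KoLeo term is the least self-explanatory item in the list above. As a rough sketch, assuming the formulation described for DINOv2 (L2-normalize each embedding, then take the negative mean log of each point's nearest-neighbor distance within the batch), it could be written in plain Python as follows. The function names and the `eps` constant are illustrative choices, not taken from the Phikon-v2 training code.

```python
import math


def l2_normalize(vec, eps=1e-8):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def koleo_loss(embeddings, eps=1e-8):
    """Negative mean log nearest-neighbor distance over L2-normalized rows.

    A low value means the embeddings are spread out; a high value means
    some of them sit very close to each other.
    """
    if len(embeddings) < 2:
        raise ValueError("need at least two embeddings")
    rows = [l2_normalize(v, eps) for v in embeddings]
    total = 0.0
    for i, row in enumerate(rows):
        # Distance from this embedding to its closest neighbor in the batch.
        nearest = min(
            math.dist(row, other) for j, other in enumerate(rows) if j != i
        )
        total += math.log(nearest + eps)
    return -total / len(rows)


# Toy 2-D "CLS tokens": one well-spread batch and one tightly clustered batch.
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
clustered = [[1.0, 0.0], [1.0, 0.01], [1.0, -0.01], [1.0, 0.02]]
```

Minimizing this quantity pushes each embedding away from its nearest neighbor, so the clustered batch above receives a larger penalty than the spread one.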

Core Capabilities

  • Feature extraction from histology images
  • ROI classification via linear or k-NN probing
  • Slide classification through multiple instance learning
  • Support for downstream segmentation tasks
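
Two of the capabilities above, nearest-neighbor probing and slide-level aggregation, can be illustrated on precomputed feature vectors. The following is a minimal sketch in plain Python, assuming cosine similarity for the neighbor vote and mean pooling as the simplest multiple-instance baseline. The function names, labels, and toy 2-D vectors are hypothetical; real features would be 1024-dimensional, and real pipelines often use learned aggregators instead of a plain mean.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, train_feats, train_labels, k=3):
    """Majority vote among the k training features most similar to the query."""
    ranked = sorted(
        range(len(train_feats)),
        key=lambda i: cosine(query, train_feats[i]),
        reverse=True,
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def mean_pool(tile_feats):
    """Average tile-level features into one slide-level vector."""
    count = len(tile_feats)
    return [sum(column) / count for column in zip(*tile_feats)]


# Toy labeled ROI features (hypothetical values and class names).
train_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_labels = ["tumor", "tumor", "normal", "normal"]

roi_prediction = knn_predict([0.8, 0.2], train_feats, train_labels, k=3)
slide_vector = mean_pool([[1.0, 2.0], [3.0, 4.0]])
```

Because the backbone stays frozen in both cases, the quality of the probe depends entirely on the extracted features, which is why this kind of probing is commonly used to compare feature extractors.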

Frequently Asked Questions

Q: What makes this model unique?

Phikon-v2 stands out mainly for the scale of its histology pretraining data: 450M images sampled from 60K whole slide images. The data draws on multiple public datasets, including TCGA, CPTAC, and GTEx, which is intended to make the learned features robust across pathology applications. The architecture itself is a ViT-L/16; the specialization comes from pretraining on histology data.

Q: What are the recommended use cases?

The model is best suited to feature extraction from histology images, for example for biomarker discovery, tumor classification, and pathology analysis. It can be used as a frozen feature extractor or fine-tuned for specific downstream tasks such as slide classification or segmentation.
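
As an illustration of the feature-extractor use case, the sketch below loads the model through the Hugging Face `transformers` auto classes and returns the CLS token for each tile. Several details are assumptions rather than facts from this page: the hub identifier `owkin/phikon-v2` (inferred from the maintainer and model name), a 224-pixel input size, and a single CLS token with no additional special tokens. Use of the weights is subject to the license listed above.

```python
def num_tokens(image_size=224, patch_size=16, extra_tokens=1):
    """Patch tokens plus extra tokens (here: CLS) for a square input.

    The 224-pixel default is an assumption; the patch size of 16 comes
    from the model description.
    """
    if image_size % patch_size:
        raise ValueError("image_size must be a multiple of patch_size")
    return (image_size // patch_size) ** 2 + extra_tokens


def extract_cls_features(image_paths, model_id="owkin/phikon-v2"):
    """Return one CLS embedding per image as a (num_images, hidden_dim) tensor.

    `model_id` is an assumed hub identifier. Imports are local so that
    `num_tokens` can be used without torch or transformers installed.
    """
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModel

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id).eval()

    images = [Image.open(path).convert("RGB") for path in image_paths]
    inputs = processor(images=images, return_tensors="pt")

    with torch.inference_mode():
        outputs = model(**inputs)

    # Index 0 along the token axis is the CLS token.
    return outputs.last_hidden_state[:, 0, :]
```

Calling `extract_cls_features(["tile_0.png", "tile_1.png"])` (hypothetical file names) should yield a tensor with one row per tile; with the embedding dimension stated above, each row would have 1024 entries. These vectors can then feed a linear probe, a k-NN probe, or a multiple-instance-learning head.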
