OpenPhenom
| Property | Value |
|---|---|
| Model Type | Vision Transformer CA-MAE |
| Developer | Recursion Pharma |
| License | Non-Commercial End User License Agreement |
| Paper | Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology |
| Training Hardware | Nvidia H100 Hopper nodes |
What is OpenPhenom?
OpenPhenom is a channel-agnostic image encoder for microscopy image analysis. It is built on a ViT-S/16 encoder backbone and applies channelwise cross-attention over patch tokens, producing a separate contextualized representation for each channel. The model was trained on three datasets: RxRx3, JUMP-CP overexpression, and JUMP-CP gene knockouts.
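To make the "patch tokens per channel" idea concrete, the sketch below counts the tokens an encoder would see if every channel is cut into its own 16×16 patches, as the "/16" in ViT-S/16 implies. The function name `token_count` and the 256×256 image size in the usage line are illustrative assumptions, not values taken from the model card.

```python
def token_count(height: int, width: int, num_channels: int, patch_size: int = 16) -> int:
    """Count patch tokens when each channel is patchified separately.

    Assumption: every channel contributes its own grid of
    patch_size x patch_size patches, and all grids are fed to the encoder.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of the patch size")
    if num_channels < 1:
        raise ValueError("need at least one channel")
    patches_per_channel = (height // patch_size) * (width // patch_size)
    return num_channels * patches_per_channel


# Hypothetical 6-channel, 256x256 crop: 16 * 16 patches per channel, 6 channels.
print(token_count(256, 256, num_channels=6))
```

Under this assumption the sequence length grows linearly with the number of channels, which is the main cost of treating channels as separate token groups rather than stacking them inside one patch.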
Implementation Details
The model is a masked autoencoder (MAE) optimized for CellPainting assay microscopy images. It processes each channel separately, so the same model applies to images with different numbers of channels. For best results, post-process the embeddings with a method such as PCA-CenterScale or Typical Variation Normalization (TVN).
- Channel-agnostic processing capability (1-11 channels)
- 384-dimensional output embeddings per channel
- Supports both single and multi-channel microscopy images
- Includes full MAE encoder and decoder architecture
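PCA-CenterScale is named above as a post-processing step but not defined. One plausible reading, sketched here as an assumption rather than the authors' exact procedure, is: fit PCA on control-well embeddings, project all embeddings onto those components, then center and scale each experimental batch using the statistics of its own controls. The function name, the `is_control` and `batch_ids` arguments, and the reliance on control wells are all illustrative.

```python
import numpy as np


def pca_center_scale(embeddings, is_control, batch_ids, n_components=None, eps=1e-8):
    """Illustrative PCA + per-batch center/scale (assumed procedure, not official).

    embeddings: (N, D) array of image- or well-level embeddings
    is_control: (N,) boolean mask marking control wells
    batch_ids:  (N,) batch or plate label for each row
    """
    embeddings = np.asarray(embeddings, dtype=np.float64)
    is_control = np.asarray(is_control, dtype=bool)
    batch_ids = np.asarray(batch_ids)

    # Fit PCA on controls only: SVD of the mean-centered control matrix.
    controls = embeddings[is_control]
    control_mean = controls.mean(axis=0)
    _, _, vt = np.linalg.svd(controls - control_mean, full_matrices=False)
    components = vt[:n_components]  # n_components=None keeps every component

    # Project every embedding onto the control-derived components.
    projected = (embeddings - control_mean) @ components.T

    # Center and scale each batch by the mean and std of its own controls.
    out = np.empty_like(projected)
    for batch in np.unique(batch_ids):
        in_batch = batch_ids == batch
        batch_controls = projected[in_batch & is_control]
        if batch_controls.shape[0] == 0:
            raise ValueError(f"batch {batch!r} has no control wells")
        mu = batch_controls.mean(axis=0)
        sigma = batch_controls.std(axis=0)
        out[in_batch] = (projected[in_batch] - mu) / (sigma + eps)
    return out
```

After this step the controls of every batch have approximately zero mean and unit variance along each retained component, so differences between batches that affect the controls are reduced before downstream comparison.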
Core Capabilities
- Generation of biologically useful embeddings from microscopy images
- Creation of contextualized embeddings for each microscopy channel
- Prediction of new channels/stains for incomplete CellPainting images
- Support for fine-tuning in downstream classification tasks
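Because the model yields one 384-dimensional embedding per channel, downstream code usually has to decide how to turn a (channels × 384) array into a single vector per image. The sketch below shows two simple options, averaging and concatenation. The helper `pool_channel_embeddings` is hypothetical and operates on plain arrays; it does not call any OpenPhenom API.

```python
import numpy as np

EMBED_DIM = 384  # per-channel embedding size stated in the model details


def pool_channel_embeddings(channel_embeddings, mode="mean"):
    """Combine per-channel embeddings of shape (C, 384) into one vector.

    mode="mean"   -> shape (384,), independent of the channel count
    mode="concat" -> shape (C * 384,), keeps channel-specific information
    """
    x = np.asarray(channel_embeddings, dtype=np.float32)
    if x.ndim != 2 or x.shape[1] != EMBED_DIM:
        raise ValueError(f"expected shape (C, {EMBED_DIM}), got {x.shape}")
    if mode == "mean":
        return x.mean(axis=0)
    if mode == "concat":
        return x.reshape(-1)
    raise ValueError(f"unknown mode: {mode!r}")


# Stand-in for the embeddings of a 6-channel image.
per_channel = np.random.default_rng(0).normal(size=(6, EMBED_DIM))
print(pool_channel_embeddings(per_channel, "mean").shape)
print(pool_channel_embeddings(per_channel, "concat").shape)
```

Averaging gives vectors that are comparable across images with different channel counts; concatenation preserves per-channel detail but only makes sense when every image has the same channels in the same order.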
Frequently Asked Questions
Q: What makes this model unique?
OpenPhenom is channel-agnostic: a single model accepts microscopy images with a variable number of channels and produces a contextualized embedding for each one. It is optimized for CellPainting assay images, and the variable channel input lets it be applied across different microscopy setups in biological research.
Q: What are the recommended use cases?
The model is best suited to large-scale microscopy image analysis, particularly CellPainting assay data. Typical uses are creating embeddings for downstream analysis, predicting missing channels, and generating feature representations for cellular biology research. It may be less reliable for single-plate experiments or non-microscopy images.