We are witnessing the emergence of many new feature extractors trained using self-supervised learning on large pathology datasets. This repository aims to provide a comprehensive list of these models, alongside key information about them.
I aim to update this list as new models are released, but please submit a pull request / issue for any models I have missed!
| Name | Group | Weights | Released | SSL | WSIs | Tiles | Patients | Batch size | Iterations | Architecture | Parameters | Embed dim | Input size | Magnification | Dataset | Links |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CTransPath | Sichuan University / Tencent AI Lab | ✅ | Dec 2021* | SRCL | 32K | 16M | | | | Swin-Transformer | 28M | 768 | 224 | ~4-10x** | TCGA, PAIP | |
| RetCCL | Sichuan University / Tencent AI Lab | ✅ | Dec 2021* | CCL | 32K | 16M | 11K | 2048 | 100 epochs | ResNet-50 | 26M | 2048 | 224 | ~4-10x** | TCGA, PAIP | |
| REMEDIS | Google Research | ✅ | May 2022* | SimCLR/BiT | 29K | 50M | 11K cases | 4096 | 1.2M | ResNet-50 | 26M | 2048 | 224 | multi-scale | TCGA | |
| HIPT | Mahmood Lab | ✅ | Jun 2022* | DINOv1 | 11K | 100M | | 256 | 400K | ViT-S | 22M | 384 | 256 | ~18-28x** | TCGA | |
| Lunit-DINO | Lunit | ✅ | Dec 2022* | DINOv1 | 21K | | | | | ViT-S | 22M | 384 | 224 | ~9-28x** | TCGA | |
| Lunit-{BT,MoCoV2,SwAV} | Lunit | ✅ | Dec 2022* | {BT,MoCoV2,SwAV} | 21K | | | | | ResNet-50 | | 2048 | 224 | ~9-62x** | TCGA | |
| Phikon | Owkin | ✅ | Jul 2023* | iBOT | 6.1K | 43M | 5.6K | 1440 | 155K | ViT-B | 86M | 768 | 224 | ~20-35x** | TCGA | |
| CONCH (VL) | Mahmood Lab | ✅ | Jul 2023* | iBOT & vision-language pretraining | 21K | 16M | | 1024 | 80 epochs | ViT-B | 86M | 768 | 224 | | proprietary | |
| UNI | Mahmood Lab | ✅ | Aug 2023* | DINOv2 | 100K | 100M | | | | ViT-L | | 1024 | 224 | ~9-25x** | proprietary (Mass-100K) | |
| Virchow | Paige / Microsoft | ✅ | Sep 2023* | DINOv2 | 1.5M | | 120K | | | ViT-H | 632M | 2560 | 224 | ~20-35x** | proprietary (from MSKCC) | |
| Campanella et al. (DINO) | Thomas Fuchs Lab | ✅ | Oct 2023* | DINOv1 | 420K | 3.3B | 77K | 1080 | 1.3K INE | ViT-S | 22M | 384 | 224 | ~20-32x** | proprietary (MSHS) | |
| Campanella et al. (MAE) | Thomas Fuchs Lab | ❌ | Oct 2023* | MAE | 420K | 3.3B | 77K | 1440 | 2.5K INE | ViT-L | 303M | 1024 | 224 | ~20-45x** | proprietary (MSHS) | |
| Path Foundation | Google | ✅ | Oct 2023* | SimCLR, MSN | 6K | 60M | | 1024 | | ViT-S | 22M | 384 | 224 | ~2-32x** | TCGA | |
| PathoDuet | Shanghai Jiao Tong University | ✅ | Dec 2023* | inspired by MoCoV3 | 11K | 13M | | 2048 | 100 epochs | ViT-B | | 4096 | 224 | ~18-124x** | TCGA | |
| RudolfV | Aignostics | ❌ | Jan 2024* | DINOv2 | 130K | 750M | 36K | | | ViT-L | 300M | | 224 | ~18-31x** | proprietary (from EU & US), TCGA | |
| kaiko | kaiko.ai | ✅ | Mar 2024* | DINOv2 | 29K | 260M** | | 512 | 200 INE | ViT-L | | 1024 | 224 | ~4-62x** | TCGA | |
| PLUTO | PathAI | ❌ | May 2024* | DINOv2 (+ MAE and Fourier loss) | 160K | 200M | | | | FlexiViT-S | 22M | | 224 | ~5-71x** | proprietary (PathAI) | |
| BEPH | Shanghai Jiao Tong University | ✅ | May 2024* | BEiTv2 | 12K | 12M | | 1024 | | ViT-B | 193M | 1024 | 224 | ~40-89x** | TCGA | |
| Prov-GigaPath | Microsoft / Providence | ✅ | May 2024* | DINOv2 | 170K | 1.4B | 30K | 384 | | ViT | | 1536 | 224 | ~18-31x** | proprietary (Providence) | |
| Hibou-B | HistAI | ✅ | Jun 2024* | DINOv2 | 1.1M | 510M | 310K cases | 1024 | 500K | ViT-B | 86M | 768 | 224 | ~20-35x** | proprietary | |
| Hibou-L | HistAI | ✅ | Jun 2024* | DINOv2 | 1.1M | 1.2B | 310K cases | 1024 | 1.2M | ViT-L | 304M | 1024 | 224 | ~20-35x** | proprietary | |
| H-optimus-0 | Bioptimus | ✅ | Jul 2024* | DINOv2/iBOT | 500K (across 4,000 clinics) | >100M | 200K | | | ViT-G with 4 registers | 1.1B | 1536 | 224 | ~20-35x** | proprietary | |
| mSTAR (VL) | Smart Lab | ❌ | Jul 2024* | mSTAR (multimodal) | 10K | | 10K | | | ViT-L | | | 224 | | TCGA | |
| GPFM | Smart Lab | ✅ | Jul 2024* | DINOv2 (+ expert KD) | 72K | 190M | | 1536 | 500K | ViT-L | 307M | 1024 | 224 | ~9-31x** | TCGA, GTEx, CPTAC, + 30 public datasets | |
| Virchow 2 | Paige / Microsoft | ✅ | Aug 2024* | DINOv2 (+ ECT and KDE) | 3.1M | 2B | 230K | 4096 | | ViT-H with 4 registers | 632M | 3584 | 224 | ~5-42x** | proprietary (from MSKCC and international sites) | |
| Virchow 2G | Paige / Microsoft | ❌ | Aug 2024* | DINOv2 (+ ECT and KDE) | 3.1M | 2B | 230K | 3072 | | ViT-G with 8 registers | 1.9B | 3584 | 224 | ~5-42x** | proprietary (from MSKCC and international sites) | |
| Phikon-v2 | Owkin | ✅ | Sep 2024* | DINOv2 | 58.4K | 456M | | 4096 | 250K | ViT-L | 307M | 1024 | 224 | ~20-35x** | PANCAN-XL (TCGA, CPTAC, GTEx, proprietary) | |
| CONCH1.5a (VL) | Mahmood Lab | ✅ | Nov 2024* | CoCa (vision-language), UNI-initialized | | 1.3M | | 256 | 20 epochs | ViT-L/16 | 306M** | 768 | 448 | | proprietary | |
| MUSK (VL) | Li Lab (Stanford) | ✅ | Jan 2025* | Unified masked modeling (MLM, MIM) + contrastive learning | 33K | 50M | 12K | 2048 | 20 epochs | BEiT3 | | | 384 | ~10-40x** | TCGA | |
| Atlas | Mayo, Charité, Aignostics | ❌ | Jan 2025* | | 1.2M | 3.4B | 490K cases | | | ViT-H | 632M | | | ~4-62x** | | |
| UNI2-h | Mahmood Lab | ✅ | Jan 2025* | DINOv2 | 350K | 200M | | | | ViT-H with 8 registers | 681M | 1536 | 224 | ~9-25x** | proprietary (Mass) | |
| UNI2-g-preview | Mahmood Lab | ❌ | Jan 2025* | DINOv2 | 350K | 200M | | | | ViT-G | | | | ~9-25x** | proprietary (Mass) | |
| PathOrchestra | Shanghai AI Lab | ✅ | Mar 2025* | DINOv2 | 290K | 140M | 41K cases | 3072 | 20 epochs | ViT-L | 304M | 1024 | 224 | ~18-25x** | proprietary, TCGA | |
| H-optimus-1 | Bioptimus | ✅ | Apr 2025* | | 1M+ (across >4K clinics) | | 800K | | | ViT-g/14 | 1.1B | 1536 | 224 | | proprietary | |
Notes:
- Models marked with (VL) use vision-language pretraining (all others are vision-only)
- Models trained on >100K slides may be considered foundation models and are marked in bold
- # of WSIs, tiles, and patients are reported to 2 significant figures
- INE = ImageNet epochs
- Order is chronological
- Some of these feature extractors have been evaluated in a benchmarking study for whole slide classification here.
- ** means inferred from other numbers provided in the paper, repository, or elsewhere
- a CONCH1.5 was released as the patch encoder for TITAN, which is a slide-level foundation model. It is also the patch encoder of THREADS.
- Input size refers to the size of the input image at inference time (in pixels)
- Magnification is the effective magnification at which the model sees tissue during pretraining, accounting for any resizing, cropping, or other augmentations. For example, if patches are obtained at 20x with patch size 1024 but resized to 224 before being fed to the model, the effective magnification is 20x × (224/1024) ≈ 4.4x. We assume the following MPP-to-magnification relationship: 0.5 MPP = 20x, 1 MPP = 10x (i.e., MPP × magnification = 10). Often, models use random crops; in this case we estimate based on the parameters of the random crop augmentation.
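The effective-magnification arithmetic from the last note can be written as two small helpers (function names are illustrative, not from any repository listed above):

```python
def effective_magnification(scan_mag: float, patch_px: int, input_px: int) -> float:
    """Magnification the model effectively sees after a patch_px-wide
    patch is resized to input_px before being fed to the network."""
    return scan_mag * input_px / patch_px


def mag_from_mpp(mpp: float) -> float:
    """Convert microns-per-pixel to nominal magnification,
    using the convention MPP x magnification = 10."""
    return 10.0 / mpp


# Worked example from the note: 20x patches of 1024 px resized to 224 px
print(round(effective_magnification(20, 1024, 224), 1))  # 4.4
print(mag_from_mpp(0.5))  # 20.0
```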
This table includes models that produce slide-level or patient-level embeddings without supervision.
| Name | Group | Weights | Released | SSL | WSIs | Patients | Batch size | Iterations | Architecture | Parameters | Embed dim | Patch size | Dataset | Links |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GigaSSL | CBIO | ✅ | Dec 2022* | SimCLR | 12K | | | 1K epochs | ResNet-18 | | 256 | 256 | TCGA | |
| PRISM (VL) | Paige / Microsoft | ✅ | May 2024* | contrastive (with language) | 590K (190K text reports) | 190K | 64 (x4) | 75K (10 epochs) | Perceiver + BioGPT | | 1280 | 224 | proprietary | |
| Prov-GigaPath | Microsoft / Providence | ✅ | May 2024* | DINOv2 | 170K | 30K | | | LongNet | 86M | 1536 | 224 | proprietary (Providence) | |
| MADELEINE (VL) | Mahmood Lab | ✅ | Aug 2024* | contrastive (InfoNCE & OT) | 16K | 2K | 120 | 90 epochs | multi-head attention MIL | | 512 | 256 | ACROBAT, BWH Kidney (proprietary) | |
| CHIEF (VL) | Yu Lab | ✅ | Sep 2024* | | | | | | | | | | | |
| COBRA | Kather Lab | ✅ | Nov 2024* | COBRA (MoCo-v3 in FM embedding space) | 3K | 2.8K | 1024 | 2K epochs | Mamba-2 + ABMIL | 15M | 768 | 224 | TCGA (BRCA, CRC, LUAD, LUSC, STAD) | |
| TITAN (VL) | Mahmood Lab | ✅ | Dec 2024* | iBOT | 340K | | 1024 | 91K (270 epochs) | ViT (smaller) | 42M | | 448 | Mass-340K (proprietary) | |
| THREADS (WSI, RNA, DNA) | Mahmood Lab | ❌ | Jan 2025* | | 47K | | 1200 | up to 101 epochs | ViT-L | | | 224 | MBTG-47k (MGH, BWH, TCGA, GTEx) | |
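For context on what these slide-level encoders replace: the simplest unsupervised slide embedding is a mean-pool over tile embeddings, whereas the models above learn aggregators (ABMIL, Perceiver, LongNet, etc.). A minimal, model-agnostic sketch using random stand-in tile embeddings (the array shapes are hypothetical, chosen to match a 768-dim patch encoder):

```python
import numpy as np

# Stand-in for the output of any patch encoder above: N tiles x D dims.
# In practice these would come from running the encoder over a slide's tiles.
tile_embeddings = np.random.default_rng(0).normal(size=(500, 768))

# Mean-pooling baseline: one fixed-length embedding per slide.
slide_embedding = tile_embeddings.mean(axis=0)
print(slide_embedding.shape)  # (768,)
```

Mean pooling needs no training but ignores tile-to-tile context; the learned aggregators in the table exist precisely to model that context.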