Initial dataset used for model validation and optimization. Note that, due to randomness and to differences in library versions and in the GNEprop library itself, results differ slightly from those reported in the manuscript, but the overall trends are consistent. Each checkpoint is evaluated over 20 folds.
| Checkpoint | Comment | AUPRC | AUROC | F1 |
|---|---|---|---|---|
| 20250801-153038 | GNEprop (no pretraining, no RDKit features) | 0.486 | 0.807 | 0.456 |
| 20250801-154530 | GNEprop (no pretraining, + RDKit features) | 0.497 | 0.825 | 0.433 |
| 20250801-150551 | GNEprop (+ pretraining, no RDKit features) | 0.558 | 0.857 | 0.487 |
| 20250801-182534 | GNEprop (+ pretraining, + RDKit features) | 0.560 | 0.848 | 0.482 |
We also release checkpoint 20250817-181637, which adds meta-learning-based fine-tuning on top of the best model (not reported in the manuscript), further improving it (AUPRC = 0.560, AUROC = 0.873, F1 = 0.493).
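For reference, here is a minimal sketch of how per-fold metrics such as those in the table above can be computed with scikit-learn. The function name, variable names, and the 0.5 decision threshold are illustrative assumptions, not the repository's actual evaluation code.

```python
# Minimal sketch: averaging AUPRC, AUROC, and F1 over folds with scikit-learn.
# Names (evaluate_folds, fold_labels, fold_scores) and the 0.5 threshold are
# illustrative assumptions, not GNEprop's evaluation pipeline.
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score, f1_score

def evaluate_folds(fold_labels, fold_scores, threshold=0.5):
    """Average AUPRC/AUROC/F1 over folds of binary labels and predicted scores."""
    auprc, auroc, f1 = [], [], []
    for y_true, y_score in zip(fold_labels, fold_scores):
        auprc.append(average_precision_score(y_true, y_score))
        auroc.append(roc_auc_score(y_true, y_score))
        y_pred = (np.asarray(y_score) >= threshold).astype(int)
        f1.append(f1_score(y_true, y_pred))
    return {name: float(np.mean(vals))
            for name, vals in (("AUPRC", auprc), ("AUROC", auroc), ("F1", f1))}
```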
Note that GNEprop has not been explicitly hyperparameter-optimized or evaluated on this dataset. Each checkpoint is evaluated over 8 folds.
20250819-085119: GNEprop trained on the GNEtolC dataset with scaffold splitting (see the sketch below)
20250819-093608: GNEprop trained on the GNEtolC dataset with scaffold-cluster splitting
20210827-082422: self-supervised checkpoint
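For context, scaffold splitting assigns all molecules sharing a Bemis-Murcko scaffold to the same split, so the test set probes generalization to unseen chemotypes. Below is a minimal sketch using RDKit; the function name and the largest-group-first assignment heuristic are illustrative assumptions, not the repository's actual split implementation.

```python
# Minimal sketch of Bemis-Murcko scaffold splitting with RDKit. The function
# name and assignment heuristic are illustrative, not GNEprop's split code.
from collections import defaultdict
from rdkit.Chem.Scaffolds import MurckoScaffold

def scaffold_split(smiles_list, train_frac=0.8):
    """Group molecules by Murcko scaffold so no scaffold spans both splits."""
    groups = defaultdict(list)
    for i, smi in enumerate(smiles_list):
        groups[MurckoScaffold.MurckoScaffoldSmiles(smiles=smi)].append(i)
    train, test = [], []
    # Assign whole scaffold groups, largest first, until the train set is full.
    for idx in sorted(groups.values(), key=len, reverse=True):
        if len(train) + len(idx) <= train_frac * len(smiles_list):
            train.extend(idx)
        else:
            test.extend(idx)
    return train, test
```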
20250811-202022: GNEprop trained on the full HTS dataset (95/5 random split). 8 folds. In addition to self-supervised pre-training (as in the manuscript), this checkpoint also leverages adversarial augmentations, which gave an additional minor improvement on the public dataset. Given randomness and version changes, predictions differ slightly from those reported for the virtual hits, although they are highly correlated (Spearman correlation = 0.62 on the virtual hits subset).
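A rank correlation like the one above can be reproduced by joining two sets of predictions and calling SciPy; a minimal sketch follows, where the file names and column names ("old_preds.csv", "new_preds.csv", "smiles", "score") are hypothetical placeholders, not files shipped with the repository.

```python
# Minimal sketch: Spearman rank correlation between two prediction sets.
# File and column names are hypothetical placeholders for illustration.
import pandas as pd
from scipy.stats import spearmanr

old = pd.read_csv("old_preds.csv")   # e.g., predictions reported previously
new = pd.read_csv("new_preds.csv")   # e.g., predictions from this checkpoint
merged = old.merge(new, on="smiles", suffixes=("_old", "_new"))
rho, pval = spearmanr(merged["score_old"], merged["score_new"])
print(f"Spearman correlation = {rho:.2f} (p = {pval:.2g})")
```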