Measure the predictive performance, privacy and explainability of your ML model.
Measure the predictive performance, privacy and explainability of an ML model using MLflow.
Currently we evaluate a model that predicts survival after a prostate cancer diagnosis, using the PLCO dataset. We measure its:
- predictive performance (using MCC (Matthews Correlation Coefficient), scaled MCC, and accuracy),
- privacy (using PBI),
- explainability (using the monotonicity and non-sensitivity of the explanation, and the fraction of features with real-world meaning).
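As a rough illustration of the predictive performance metrics, the sketch below computes accuracy, MCC and a scaled MCC with scikit-learn on made-up labels. The `scaled_mcc` helper is hypothetical and assumes the common rescaling `(MCC + 1) / 2`, which maps the score from [-1, 1] to [0, 1]; the definition used in run.py may differ.

```python
from sklearn.metrics import accuracy_score, matthews_corrcoef


def scaled_mcc(y_true, y_pred):
    """Rescale MCC from [-1, 1] to [0, 1].

    Assumption: scaled MCC = (MCC + 1) / 2. Check run.py for the
    definition actually used in this repository.
    """
    return (matthews_corrcoef(y_true, y_pred) + 1) / 2


# Made-up labels for illustration only (1 = survived, 0 = did not survive).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

acc = accuracy_score(y_true, y_pred)
mcc = matthews_corrcoef(y_true, y_pred)
smcc = scaled_mcc(y_true, y_pred)

print(f"accuracy={acc:.2f} mcc={mcc:.2f} scaled_mcc={smcc:.2f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it is less flattering when the classes are imbalanced.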
The run.py file contains all the functions to run the training and evaluation of the model.
To perform an evaluation, choose the model (a scikit-learn estimator) and its parameters, together with the data file and the column description file.
These options are selected in the test_run.ipynb file.
The data file is the PLCO first cancer dataset. Unfortunately, due to sharing restrictions, it can't be provided in the repository, but access can be requested here; requests are usually processed within 2 weeks. The data was slightly modified to standardise non-response answers to NaN.
In our experiment, we include only patients diagnosed with prostate cancer: fstcan_cancersite = 1.
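The preparation described above could look roughly like the following pandas sketch. Only the `fstcan_cancersite` column and the value `1` come from this README; the tiny DataFrame, the `smoker` column and the `-9` non-response code are hypothetical stand-ins, since the real data cannot be shared.

```python
import pandas as pd

# Hypothetical stand-in for the PLCO data file.
df = pd.DataFrame(
    {
        "fstcan_cancersite": [1, 2, 1, 3, 1],
        "age": [62, 70, 58, 66, 73],
        "smoker": [0, -9, 1, 1, -9],  # hypothetical column
    }
)

# Assumption: non-response answers are encoded with a sentinel value.
NON_RESPONSE_CODES = [-9]

# Standardise non-response answers to NaN.
df["smoker"] = df["smoker"].mask(df["smoker"].isin(NON_RESPONSE_CODES))

# Keep only patients diagnosed with prostate cancer.
prostate = df[df["fstcan_cancersite"] == 1].copy()

print(prostate)
```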
The column information file columns_prostate.csv holds the relevant information about the features selected for the experiment:
- Keep: a subset of all the features that could reasonably be included. It excludes features that obviously reveal the prediction target, duplicates in rarely used encodings, and trial information. It is used as the baseline to calculate PBI.
- Keep Narrow: a subset of Keep, which is the current set of features being evaluated.
- Categorical: indicates whether a feature is categorical, as these are often represented with integers in the dataset. Marking them as categorical stops the features from being incorrectly interpreted as ordinal.
- Clinical Meaning: indicates whether a feature has clinical meaning. Used to derive the fraction of features with clinical meaning.
- Demographic: indicates whether the column contains demographic information.
- Patient history: indicates whether the column contains patient history (incomplete).
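To show how such a column description file might be consumed, here is a minimal sketch that selects the Keep Narrow features, lists the categorical ones, and derives the fraction of features with clinical meaning. The flag column names come from the list above; the `Name` column, the 0/1 encoding, the feature names and the choice to compute the fraction over Keep Narrow are all assumptions, so adjust them to the real columns_prostate.csv.

```python
import io

import pandas as pd

# Hypothetical excerpt of a column description file (not the real one).
csv_text = """Name,Keep,Keep Narrow,Categorical,Clinical Meaning
feature_a,1,1,0,1
feature_b,1,1,1,0
feature_c,1,0,0,1
feature_d,1,1,0,1
"""

columns = pd.read_csv(io.StringIO(csv_text))

# Features currently being evaluated.
narrow = columns[columns["Keep Narrow"] == 1]
features = narrow["Name"].tolist()

# Features to treat as categorical rather than ordinal.
categorical = narrow.loc[narrow["Categorical"] == 1, "Name"].tolist()

# Fraction of the evaluated features that have clinical meaning.
clinical_fraction = float(narrow["Clinical Meaning"].mean())

print(features, categorical, round(clinical_fraction, 3))
```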
The output is logged using MLflow. To start the MLflow application, run mlflow server, specifying the port if necessary with mlflow server --port <port_number>.
Adam Harrison - the code. Work performed while working for IT Innovation, University of Southampton.
Chris Duckworth - supervision

