Add CaliForest model, data loader, task, and calibration metrics#971

Open
cgigliob wants to merge 1 commit into sunlabuiuc:master from cgigliob:master

@cgigliob

CaliForest: Calibrated Random Forest for Health Data

Contributors:

Contribution Type: Full Pipeline (Dataset + Task + Model)

Paper: Y. Park and J. C. Ho. "CaliForest: Calibrated Random Forest for Health Data." ACM CHIL, 2020. https://doi.org/10.1145/3368555.3384461

Description:
Reimplementation of CaliForest within PyHealth. CaliForest uses the variance of out-of-bag (OOB) predictions, combined with an Inverse-Gamma prior, to learn a calibrated random forest without requiring a held-out calibration set. Supports isotonic, logistic, and beta calibration. Includes a dataset loader for MIMIC-Extract, a parameterized task covering 4 binary prediction targets, and 6 calibration metrics.
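
For reviewers new to the method, here is a minimal sketch of the OOB weighting idea described above. It is not the code in `pyhealth/models/califorest.py`. The function name `oob_calibration_weights`, the `None` encoding for in-bag samples, and the default values of `alpha0` and `beta0` are assumptions made for illustration. The sketch assumes the weight is the posterior expected precision (`alpha / beta`) after updating an Inverse-Gamma prior on the per-sample prediction variance.

```python
from statistics import fmean, pvariance


def oob_calibration_weights(oob_preds, alpha0=100.0, beta0=25.0):
    """Return (mean OOB prediction, calibration weight) lists, one entry per sample.

    oob_preds[i][t] is tree t's predicted probability for sample i, or None
    when sample i was in tree t's bootstrap sample (hypothetical encoding).
    alpha0 and beta0 are illustrative Inverse-Gamma prior hyperparameters.
    """
    means, weights = [], []
    for row in oob_preds:
        votes = [p for p in row if p is not None]
        n_oob = len(votes)
        if n_oob == 0:
            raise ValueError("every sample needs at least one OOB prediction")
        mean = fmean(votes)
        var = pvariance(votes, mu=mean)  # population variance of the OOB votes
        # Conjugate update of the Inverse-Gamma prior on the variance.
        alpha = alpha0 + n_oob / 2
        beta = beta0 + var * n_oob / 2
        means.append(mean)
        # Posterior expected precision: stable OOB votes get larger weights.
        weights.append(alpha / beta)
    return means, weights
```

The resulting means and weights would then be passed, together with the true labels, to a calibrator that accepts per-sample weights (isotonic, logistic, or beta), so no separate calibration split is needed.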

Files to review:

New files:

  • pyhealth/models/califorest.py — CaliForest model (BaseModel)
  • pyhealth/datasets/califorest_mimic_extract.py — MIMIC-Extract loader (BaseDataset)
  • pyhealth/datasets/configs/califorest_mimic_extract.yaml — Dataset config
  • pyhealth/tasks/mimic_extract_califorest.py — Task (BaseTask)
  • pyhealth/metrics/califorest_calibration.py — 6 calibration metrics
  • tests/test_califorest.py — Model tests (7)
  • tests/test_calibration_metrics.py — Metrics tests (11)
  • tests/test_mimic_extract_califorest_task.py — Task tests (7)
  • tests/test_califorest_mimic_extract_dataset.py — Dataset tests (6)
  • examples/califorest_mimic_extract_mortality_califorest.py — Ablation script
  • docs/api/models/pyhealth.models.CaliForest.rst
  • docs/api/datasets/pyhealth.datasets.CaliForestMIMICExtractDataset.rst
  • docs/api/tasks/pyhealth.tasks.MIMICExtractCaliForestTask.rst

Modified files:

  • docs/api/models.rst — Added toctree entry
  • docs/api/datasets.rst — Added toctree entry
  • docs/api/tasks.rst — Added toctree entry

31 tests passing. All use synthetic data.
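
This description does not name the six calibration metrics. As an illustration of what a calibration metric computes on synthetic inputs, here is a small sketch of the Brier score and a scaled variant. The function names are hypothetical and may not match the API in `pyhealth/metrics/califorest_calibration.py`, and these two metrics are assumed examples, not a statement of what the module contains.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier_score(y_true, y_prob):
    """1 - Brier / reference, where the reference always predicts the prevalence.

    A value of 1 is a perfect score; 0 matches the prevalence-only baseline.
    """
    prevalence = sum(y_true) / len(y_true)
    reference = prevalence * (1 - prevalence)  # Brier score of the baseline
    if reference == 0:
        raise ValueError("scaled Brier is undefined when only one class is present")
    return 1 - brier_score(y_true, y_prob) / reference


# Tiny synthetic example: two negatives, two positives.
labels = [0, 0, 1, 1]
probs = [0.0, 0.5, 0.5, 1.0]
print(brier_score(labels, probs), scaled_brier_score(labels, probs))
```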
