
Add/cxr comprehensive tutorial and benchmarks for PyHealth paper#773

Merged
jhnwu3 merged 17 commits into master from add/cxr_comprehensive_tutorial on Jan 22, 2026
Conversation

Collaborator

@jhnwu3 jhnwu3 commented Jan 9, 2026

This pull request introduces several documentation improvements and adds a new benchmarking script for length of stay prediction using pandas. The most significant changes are the addition of interpretability documentation for Vision Transformers (ViT), new visualization utility documentation, expanded tutorial listings for image analysis, and a comprehensive benchmark script for MIMIC-IV data.

Documentation Enhancements for Interpretability and Vision Transformers (ViT):

  • Added a new example for Chefer's attention-based attribution for Vision Transformers to the interpretability documentation, including training and visualization steps for COVID-19 chest X-ray classification (docs/api/interpret.rst).
  • Documented new visualization utilities for attribution overlays and ViT-specific visualizations, with links to relevant utility functions (docs/api/interpret.rst, docs/api/interpret/pyhealth.interpret.utils.rst). [1] [2]

Expanded Tutorials and Example Listings:

  • Updated the image analysis tutorial table to include ViT training and interpretability, binary and multilabel classification notebooks, and saliency map examples for chest X-ray datasets (docs/tutorials.rst).
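The ViT interpretability notebooks listed above center on Chefer-style attention attribution. As a rough illustration of the underlying attention-rollout idea only (this is generic NumPy, not PyHealth's CheferRelevance API; Chefer's method additionally weights the attention maps by gradients and relevance scores):

```python
import numpy as np

def attention_rollout(attentions):
    """Multiply head-averaged attention maps across layers, mixing in an
    identity term to account for each block's residual connection
    (the rollout scheme of Abnar & Zuidema, which Chefer's method builds on)."""
    n = attentions[0].shape[-1]
    rollout = np.eye(n)
    for attn in attentions:                 # attn: (n_tokens, n_tokens)
        a = 0.5 * attn + 0.5 * np.eye(n)    # residual/skip connection
        a = a / a.sum(axis=-1, keepdims=True)  # re-normalize rows
        rollout = a @ rollout
    return rollout

# Toy example: 3 layers of uniform attention over 4 tokens (CLS + 3 patches).
layers = [np.full((4, 4), 0.25) for _ in range(3)]
r = attention_rollout(layers)
cls_to_patches = r[0, 1:]   # relevance of each patch token to the CLS token
print(cls_to_patches)
```

For a real ViT, the per-layer attention maps would come from the model's attention modules, and the CLS-to-patch row is reshaped to the patch grid and upsampled to overlay on the chest X-ray.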

New Benchmark Script for Length of Stay Prediction:

  • Added examples/benchmark_perf/benchmark_pandas_los.py, a standalone script that benchmarks visit-level length of stay prediction on MIMIC-IV data using pandas. The script processes admissions, diagnoses, procedures, and prescriptions, categorizes LOS, tracks memory usage, and outputs summary statistics and results.
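As a minimal sketch of the pandas-side LOS categorization step, the binning below (under one day, each of days 1-7, one to two weeks, over two weeks) follows the 10-class scheme commonly used in PyHealth's LOS tutorials; the actual script's bins and column names may differ:

```python
import pandas as pd

def categorize_los(days: int) -> int:
    """Map a length of stay in whole days to one of 10 classes."""
    if days < 1:
        return 0
    elif days <= 7:
        return days   # classes 1-7 for 1-7 day stays
    elif days <= 14:
        return 8      # one to two weeks
    return 9          # more than two weeks

# Toy admissions table (hypothetical column names mirroring MIMIC-IV).
admissions = pd.DataFrame({
    "hadm_id": [1, 2, 3],
    "admittime": pd.to_datetime(["2130-01-01", "2130-02-01", "2130-03-01"]),
    "dischtime": pd.to_datetime(["2130-01-03", "2130-02-20", "2130-03-01"]),
})
admissions["los_days"] = (admissions["dischtime"] - admissions["admittime"]).dt.days
admissions["los_class"] = admissions["los_days"].map(categorize_los)
print(admissions["los_class"].tolist())  # [2, 9, 0]
```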

@jhnwu3 added the "component: interpret" label on Jan 9, 2026
@jhnwu3 added the "documentation" label on Jan 11, 2026
Collaborator


I converted chestxray14_binary_classification.ipynb and chestxray14_multilabel_classification.ipynb to .py files in #777. I tried to use git mv to move them to examples/cxr/ first to avoid a conflict with this PR, but it looks like that didn't work. You can delete these two files; if you merge them as-is, they will be duplicates of the .py versions.

Collaborator Author


No worries, I think the notebook version isn't necessarily a bad idea to have too.

Comment thread docs/tutorials.rst
* - ``cxr/cnn_cxr.ipynb``
- CNN for chest X-ray classification (notebook)
* - ``chestXray_image_generation_VAE.py``
* - ``cxr/chestxray14_binary_classification.ipynb``
Collaborator


Suggested change
* - ``cxr/chestxray14_binary_classification.ipynb``
* - ``cxr/chestxray14_binary_classification.py``

Comment thread docs/tutorials.rst
parse_options=pv.ParseOptions(delimiter=delimiter),
parse_options=pv.ParseOptions(
delimiter=delimiter, newlines_in_values=True
),
Collaborator Author


I do need a check from someone like @Logiquo on whether this will break any of the test cases/workflows, since I changed this option to support reading notes.

Collaborator

@Logiquo Logiquo commented Jan 22, 2026


This should be safe unless it breaks some unusual test case. In theory, a well-formatted CSV file should not be affected by this.

@jhnwu3 jhnwu3 requested a review from Logiquo January 21, 2026 20:27
@jhnwu3 jhnwu3 merged commit c5c01d9 into master Jan 22, 2026
1 check passed
@jhnwu3 jhnwu3 deleted the add/cxr_comprehensive_tutorial branch January 22, 2026 22:14
racoffey pushed a commit to jburhan/PyHealth that referenced this pull request Apr 21, 2026
…labuiuc#773)

* init commit to add many new examples for paper release

* more updates

* remove random .png file

* reorganization of example directory to reduce visual clutter

* more updates to pathing of examples

* more updates to the examples benchmark

* more examples refactoring for cleanliness

* newer and cleaner examples

* little updates in multimodal sample generation

* more updates to multimodal demo, and fixed some bugs in the stagenet example that were never fixed before

* more code examples here for perf benchmarking and line count

* more little details, let's see if it passes workflow

* tutorial update requested

* fix for parsing errors, found out I messed with the wrong raw processor

Merged; will fix things later if anything breaks.

Labels

component: interpret Contribute a new interpretability method to PyHealth documentation Improvements or additions to documentation

3 participants