
Alzheimer MRI Processing Pipeline

A complete, production-ready 3D NIfTI preprocessing pipeline for ADNI T1-weighted MRI data, designed to accelerate Alzheimer's disease research by automating the entire preprocessing workflow from raw NIfTI files to training-ready 2D image sequences.

ADNI/IDA Compliance Notice

Project status: Stable release. Current version: v1.7.4.

This project is an independent, open-source effort. It is not affiliated with, endorsed by, or sponsored by the Alzheimer's Disease Neuroimaging Initiative (ADNI) or the Imaging Data Archive (IDA) operated by the Laboratory of Neuro Imaging (LONI). "ADNI" and "IDA-LONI" are trademarks of their respective owners and are used solely to indicate data compatibility.

This repository contains code only. It does not host or distribute ADNI data. Access to ADNI/IDA is governed by their Data Use Agreements (DUAs). Users are solely responsible for compliance with all applicable terms.

Before requesting data access, read the ADNI Data Use Agreement and then follow the ADNI data access instructions.

Requirements for ADNI Data Users

If you use this pipeline with ADNI data, you must comply with the ADNI DUA: https://ida.loni.usc.edu/collaboration/access/appLicense.jsp

By using this pipeline with ADNI data, you confirm that you have read, understood, and agree to comply with all terms of the ADNI Data Use Agreement.


Overview

This repository provides a complete 3D NIfTI preprocessing pipeline for ADNI T1-weighted MRI data, transforming raw medical imaging files into training-ready 2D image sequences optimized for temporal deep learning models (e.g., CNN+LSTM architectures), though the outputs are equally suitable for non-temporal models.

The pipeline was developed in early 2025 as part of a final-year thesis project, with the goal of reducing time spent on data wrangling so researchers can focus on modeling and experimentation.

Key Benefits

  • End-to-End Automation: Complete preprocessing workflow from raw NIfTI to training-ready images
  • Modular Design: Each stage can be run independently or as part of the full pipeline
  • Resume Capability: Long-running processes can be resumed from checkpoints
  • Cross-Platform: Windows (primary) and Unix-based systems (Linux/macOS)
  • Production-Ready: Comprehensive error handling, logging, and progress tracking
  • GPU Acceleration: Optimized for CUDA-enabled GPUs with automatic fallback to CPU

Features

  • Complete 3D NIfTI Processing Pipeline - From raw files to training-ready 2D sequences
  • Modular Stage Architecture - Run stages independently or end-to-end
  • GPU Acceleration - CUDA support with automatic CPU fallback
  • Resume Capability - Checkpoint-based resumption for long-running processes
  • Rich Console Output - Beautiful, informative progress indicators and summaries
  • JSON Reports - Detailed execution reports for each stage
  • Configuration Management - YAML-based configuration with CLI overrides
  • Convenience Scripts - Pre-built scripts for easy execution (Windows/Unix)
  • Template Support - Pre-configured templates for MNI brain and hippocampus ROI

Pipeline Architecture

The pipeline consists of 4 main stages, each with multiple substages:

Stage 1: Environment Setup

  • GPU detection and verification
  • Package dependency checking and installation
  • Performance testing and optimization

Stage 2: Data Preparation

  • Split: Stratified train/validation/test splitting
  • Analyze: Metadata analysis and statistics generation

Stage 3: NIfTI Processing

  • Skull Stripping: HD-BET-based brain extraction
  • Template Registration: ANTs-based MNI template alignment
  • Labelling: Temporal sequence organization
  • 2D Conversion: NIfTI to PNG conversion with slice extraction
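
The 2D conversion step can be sketched in miniature: pull a single slice out of a 3D volume and rescale its intensities to 8-bit before writing a PNG. The sketch below is illustrative only and uses a synthetic NumPy array in place of a real NIfTI volume; `extract_axial_slice` is a hypothetical helper, not part of the package, and the pipeline's actual implementation may differ.

```python
import numpy as np

def extract_axial_slice(volume, index=None):
    """Take one axial slice from a 3D volume and scale it to 8-bit."""
    if index is None:
        index = volume.shape[2] // 2  # default: mid-volume slice
    sl = volume[:, :, index].astype(np.float64)
    lo, hi = sl.min(), sl.max()
    if hi > lo:
        sl = (sl - lo) / (hi - lo)  # normalize intensities to [0, 1]
    else:
        sl = np.zeros_like(sl)
    return (sl * 255).astype(np.uint8)

# synthetic stand-in for nib.load(...).get_fdata() on a 1mm MNI-sized volume
volume = np.random.rand(182, 218, 182)
png_ready = extract_axial_slice(volume)
print(png_ready.shape, png_ready.dtype)  # → (182, 218) uint8
```

The resulting `uint8` array is what an image writer (e.g., Pillow) would save as a PNG.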

Stage 4: Image Processing

  • Center Crop: Temporal sequence extraction and cropping
  • Image Enhancement: Grey Wolf Optimizer-based enhancement
  • Data Balancing: Augmentation and class balancing

Prerequisites

System Requirements

  • Python: 3.11 or higher (3.12 recommended)
  • Operating System: Windows 10/11, Linux, or macOS
  • RAM: Minimum 8GB (16GB+ recommended for large datasets)
  • Storage: Sufficient space for processed outputs (typically 2-3x input size)
  • GPU: Optional but recommended (NVIDIA GPU with CUDA support)

Required Software

External Dependencies (Optional)

  • HD-BET - Automatically installed via pip if needed
  • ANTs (antspyx) - Required for template registration stage
    • Install manually: pip install antspyx or via environment setup stage

Template Files

Before running the pipeline, download the required template files:

  1. MNI Brain Template:

    • Location: support_files/templates/mni-brain/
    • Download: MNI152_T1_1mm_brain.nii.gz
    • See support_files/templates/mni-brain/README.md for details
  2. Hippocampus ROI Mask:

    • Location: support_files/templates/hippocampal-roi/
    • Download: NeuroVault Image 448213
    • See support_files/templates/hippocampal-roi/README.md for details

Installation

  1. Clone the repository:

    git clone https://github.com/zashari/alzheimer-mri-processing-pipeline.git
    cd alzheimer-mri-processing-pipeline
  2. Create a virtual environment:

    # Windows
    python -m venv venv
    venv\Scripts\activate
    
    # Linux/macOS
    python3 -m venv venv
    source venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Install the package in editable mode:

    pip install -e .
  5. Verify installation:

    adp --help

Post-Installation

  1. Download template files (see Prerequisites)
    • MNI brain template: Place in support_files/templates/mni-brain/MNI152_T1_1mm_brain.nii.gz
    • Hippocampus ROI: Place in support_files/templates/hippocampal-roi/hippho50.nii.gz
  2. Configure paths in configs/default.yaml:
    • Set paths.data_root to your raw dataset directory
    • Set paths.metadata_csv to your metadata CSV file path
    • Set paths.output_root (default: outputs)
  3. Run environment setup to verify GPU and dependencies:
    adp environment_setup setup --auto-install true

Quick Start

After installation and configuration, run the pipeline using the provided convenience scripts:

Option 1: Run Individual Stages (Recommended)

Execute stages one-by-one to monitor progress:

Windows:

scripts\windows_based_system\run_environment_setup.bat
scripts\windows_based_system\run_data_preparation.bat
scripts\windows_based_system\run_nifti_processing.bat
scripts\windows_based_system\run_image_processing.bat

Linux/macOS:

./scripts/unix_based_system/run_environment_setup.sh
./scripts/unix_based_system/run_data_preparation.sh
./scripts/unix_based_system/run_nifti_processing.sh
./scripts/unix_based_system/run_image_processing.sh

Option 2: Run Full Pipeline (Advanced)

For users with sufficient resources, run everything in one go:

Windows:

scripts\windows_based_system\run_full_pipeline.bat

Linux/macOS:

./scripts/unix_based_system/run_full_pipeline.sh

⚠️ Warning: The full pipeline runs all stages sequentially and may take several hours.

Using CLI for Advanced Control

If you need to customize parameters or use specific options, use the adp CLI command directly. See the Usage section below for details.


Usage

Using Convenience Scripts (Recommended)

The easiest way to run the pipeline is using the provided convenience scripts. They handle all the necessary commands automatically.

What Each Script Does

  • run_environment_setup - GPU detection, package installation, performance testing
  • run_data_preparation - Data splitting and metadata analysis
  • run_nifti_processing - All NIfTI processing substages (skull stripping → template registration → labelling → 2D conversion)
  • run_image_processing - All image processing substages (center crop → enhancement → balancing)
  • run_full_pipeline - Complete pipeline end-to-end (all stages sequentially)

For detailed script documentation, see scripts/README.md.

Using CLI for Advanced Control

If you need to customize parameters, use specific options, or run individual substages, use the adp command directly:

General Syntax

adp <stage> <action> [--substage <substage>] [options]

Available Stages & Actions

Environment Setup:

adp environment_setup verify                    # Quick verification
adp environment_setup setup --auto-install true # Full setup

Data Preparation:

adp data_preparation split                      # Split data - this will automatically generate manifests
adp data_preparation analyze                    # Analyze metadata
adp data_preparation manifests                  # Generate manifests only

NIfTI Processing:

# Test mode (process samples)
adp nifti_processing test --substage skull_stripping
adp nifti_processing test --substage template_registration

# Process all files
adp nifti_processing process --substage skull_stripping
adp nifti_processing process --substage template_registration
adp nifti_processing process --substage labelling
adp nifti_processing process --substage twoD_conversion

Image Processing:

adp image_processing process --substage center_crop
adp image_processing process --substage image_enhancement
adp image_processing process --substage data_balancing

Common CLI Options

  • --set <key=value> - Override configuration values (repeatable, use comma for arrays)
  • --config <path> - Specify custom configuration file
  • --debug - Enable debug output
  • --quiet - Suppress non-essential output
  • --dry-run - Show what would be done without executing
  • --seed <int> - Set random seed for reproducibility
  • --log-file <path> - Write logs to file

CLI Examples

# Custom split ratios (must sum to 1.0)
adp data_preparation split --set data_preparation.split_ratios=0.7,0.2,0.1

# Use CPU instead of GPU for skull stripping
adp nifti_processing process --substage skull_stripping --set nifti_processing.skull_stripping.device=cpu

# Custom output directory
adp image_processing process --substage center_crop --set paths.output_root=/custom/output/path

# Debug mode with custom seed
adp data_preparation split --debug --seed 12345

Configuration

The pipeline uses YAML-based configuration files with hierarchical override support. Configuration is loaded from multiple sources in this order (later sources override earlier ones):

  1. configs/default.yaml - Main configuration (paths, global settings)
  2. configs/stages/*.yaml - Stage-specific defaults
  3. User config file (via --config)
  4. Environment variables (ADP_* prefix)
  5. CLI overrides (via --set)
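
The layering above behaves like a recursive dictionary merge: later sources win for scalar values, while nested sections are merged key by key rather than replaced wholesale. The sketch below illustrates that precedence idea only; it is not the pipeline's actual loader.

```python
def deep_merge(base, override):
    """Return base updated with override; nested dicts merge, scalars replace."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested sections
        else:
            merged[key] = value  # later source wins for scalars/lists
    return merged

defaults = {"paths": {"output_root": "outputs", "data_root": "raw"}, "seed": 42}
cli_overrides = {"paths": {"output_root": "/custom/output"}}
config = deep_merge(defaults, cli_overrides)
# output_root is overridden, while data_root and seed survive from defaults
```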

Essential Configuration

Before running the pipeline, configure these paths in configs/default.yaml:

paths:
  data_root: "path/to/raw/nifti_files"      # Your raw dataset directory (example: {your_project/datasets/ADNI_1_5_T/})
  output_root: "outputs"                     # Output directory (relative to project root)
  metadata_csv: "path/to/metadata.csv"       # Primary metadata CSV file path (example: {your_project/datasets/ADNI_1_5_T/metadata.csv})

Key Settings

Data Preparation:

  • required_visits: ["sc", "m06", "m12"] - Required visits for complete sequences
  • split_ratios: [0.7, 0.15, 0.15] - Train/Val/Test ratios (must sum to 1.0)
  • stratify_by: "Group" - Column name for stratification
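
As a rough illustration of what these settings mean, stratified splitting partitions subjects per class so that each split preserves the `Group` proportions. The helper below is hypothetical and for illustration only, not the pipeline's implementation.

```python
import random
from collections import defaultdict

def stratified_split(subjects, groups, ratios=(0.7, 0.15, 0.15), seed=42):
    """Split subject IDs into train/val/test, preserving class proportions."""
    assert abs(sum(ratios) - 1.0) < 1e-9, "split_ratios must sum to 1.0"
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for subj, grp in zip(subjects, groups):
        by_group[grp].append(subj)
    splits = {"train": [], "val": [], "test": []}
    for members in by_group.values():
        rng.shuffle(members)  # shuffle within each class
        n = len(members)
        n_train = round(n * ratios[0])
        n_val = round(n * ratios[1])
        splits["train"] += members[:n_train]
        splits["val"] += members[n_train:n_train + n_val]
        splits["test"] += members[n_train + n_val:]
    return splits

subjects = [f"subj_{i:03d}" for i in range(100)]
groups = ["CN"] * 60 + ["AD"] * 40
splits = stratified_split(subjects, groups)
```

With the default 0.7/0.15/0.15 ratios, each class contributes roughly 70% of its members to train, so class balance carries over into every split.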

NIfTI Processing:

  • device: "cuda" - Processing device ("cuda", "cpu", or "mps")
  • mni_template_path: Path to MNI brain template
  • hippocampus_roi_path: Path to hippocampus ROI mask

Image Processing:

  • augmentation_targets: Target counts per class for data balancing
  • gwo_iterations: Number of Grey Wolf Optimizer iterations for enhancement
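
For background on the `gwo_iterations` setting: the Grey Wolf Optimizer (Mirjalili et al., 2014) searches by moving candidate solutions toward the three current best wolves (alpha, beta, delta) each iteration. The sketch below shows the core update on a toy objective; it is not the pipeline's enhancement objective or its actual implementation.

```python
import random

def gwo_minimize(f, dim, bounds, n_wolves=10, iterations=50, seed=0):
    """Bare-bones Grey Wolf Optimizer (after Mirjalili et al., 2014)."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    for t in range(iterations):
        wolves.sort(key=f)
        # snapshot the three best wolves before updating positions
        alpha, beta, delta = (list(w) for w in wolves[:3])
        a = 2 - 2 * t / iterations  # control parameter, decays from 2 to 0
        for w in wolves:
            for d in range(dim):
                pull = 0.0
                for leader in (alpha, beta, delta):
                    r1, r2 = rng.random(), rng.random()
                    A = 2 * a * r1 - a            # exploration/exploitation factor
                    C = 2 * r2                    # leader-weighting factor
                    D = abs(C * leader[d] - w[d])
                    pull += leader[d] - A * D
                w[d] = min(max(pull / 3, lo), hi)  # average pull, clipped to bounds
    return min(wolves, key=f)

# toy objective: sphere function, minimum 0 at the origin
best = gwo_minimize(lambda v: sum(x * x for x in v), dim=2, bounds=(-5, 5))
```

More iterations generally means tighter convergence at the cost of runtime, which is the trade-off `gwo_iterations` controls.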

Overriding Configuration

You can override any configuration value via CLI without editing files:

# Override split ratios
adp data_preparation split --set data_preparation.split_ratios=0.7,0.2,0.1

# Override device and other settings
adp nifti_processing process --substage skull_stripping \
  --set nifti_processing.skull_stripping.device=cpu \
  --set nifti_processing.skull_stripping.use_tta=true

# Use custom config file
adp data_preparation split --config my_custom_config.yaml

For complete configuration options, see the YAML files in configs/ directory.


Project Structure

alzheimer-mri-processing-pipeline/
├── configs/                 # Configuration files
│   ├── default.yaml        # Main configuration
│   └── stages/             # Stage-specific configurations
├── docs/                   # Documentation
│   ├── CHANGELOG.md        # Version history
│   └── THIRD_PARTY_NOTICES.md
├── scripts/                # Convenience scripts
│   ├── unix_based_system/  # Shell scripts (Linux/macOS)
│   └── windows_based_system/  # Batch scripts (Windows)
├── src/
│   └── data_processing/    # Main package
│       ├── cli.py          # CLI entry point
│       ├── config/         # Configuration management
│       ├── data_preparation/  # Data preparation stage
│       ├── environment_setup/  # Environment setup stage
│       ├── image_processing/   # Image processing stage
│       ├── nifti_processing/   # NIfTI processing stage
│       └── stages/         # Stage registry
├── support_files/
│   └── templates/          # Template files (MNI brain, ROI masks)
├── outputs/                # Generated outputs (gitignored)
├── .reports/               # Execution reports (gitignored)
├── pyproject.toml          # Package metadata and build configuration
├── requirements.txt        # Python dependencies
├── README.md               # This file
└── LICENSE                 # MIT License

Output Structure

After running the pipeline, outputs are organized as follows:

outputs/
├── 1_splitted_sequential/     # Data preparation outputs
│   ├── train/
│   ├── val/
│   └── test/
├── manifests/                  # CSV manifests (at output root)
│   ├── metadata_split.csv
│   ├── train.csv
│   ├── val.csv
│   └── test.csv
├── 2_skull_stripping/         # Skull stripping outputs
├── 3_optimal_slices/          # Template registration outputs
│   ├── axial/
│   ├── coronal/
│   ├── sagittal/
│   └── hippocampus_masks_3D/
├── 4_labelling/                # Labelling outputs
├── 5_twoD/                    # 2D conversion outputs
├── 6_center_crop/              # Center crop outputs
├── 7_enhanced/                 # Image enhancement outputs
└── 8_balanced/                # Data balancing outputs

.reports/                      # JSON execution reports (at project root)
├── environment_setup_*.json
├── data_preparation_*.json
├── nifti_processing_*.json
└── image_processing_*.json

.visualizations/                # Visualization outputs (at output root)
├── data_preparation/
├── nifti_processing/
│   ├── skull_stripping/
│   ├── template_registration/
│   ├── labelling/
│   └── twoD_conversion/
└── image_processing/
    ├── center_crop/
    ├── image_enhancement/
    └── data_balancing/

Contributing

Contributions are welcome! Please follow these guidelines:

  1. Fork the repository and create a feature branch
  2. Follow code style - Use consistent formatting and naming conventions
  3. Add tests - Include tests for new features when possible
  4. Update documentation - Keep README and docstrings up to date
  5. Submit a Pull Request - Include a clear description of changes

Development Setup

# Clone your fork
git clone https://github.com/your-username/alzheimer-mri-processing-pipeline.git
cd alzheimer-mri-processing-pipeline

# Create virtual environment
python -m venv venv
source venv/bin/activate  # or: venv\Scripts\activate on Windows

# Install dependencies
pip install -r requirements.txt
pip install -e .

# Make changes and test
adp --help

# Run tests (if available)
pytest

For more details, see CONTRIBUTING.md and CODE_OF_CONDUCT.md.


Citation

If you use this pipeline in your research, please cite it appropriately. This helps track the impact of this work and supports future development.

Quick Citation

BibTeX format:

@misc{alzheimer_mri_processing_pipeline_2025,
  title        = {alzheimer-mri-processing-pipeline},
  author       = {Ashari, Zaky and contributors},
  year         = {2025},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/zashari/alzheimer-mri-processing-pipeline}},
  note         = {Version v1.7.4}
}

APA format:

Ashari, Z. (2025). alzheimer-mri-processing-pipeline [Computer software]. GitHub. 
https://github.com/zashari/alzheimer-mri-processing-pipeline

Citation File

  • CITATION.cff — Citation File Format for automatic citation (GitHub will display a "Cite this repository" button)

References

This pipeline is built upon and informed by the following research papers and methods. When publishing work that uses this pipeline, please cite the relevant references:

| Methodology | Author(s) | Links |
| --- | --- | --- |
| Skull Stripping | Druzhinina, P.; Kondrateva, E. | Click here |
| Skull Stripping | Isensee, F.; Schell, M.; Tursunova, I.; et al. | Click here; Click here |
| Alzheimer's Disease ROI | Hassouneh, A.; Bazuin, B.; Danna-Dos-Santos, A.; Acar, I.; Abdel-Qader, I.; ADNI | Click here |
| Data Augmentation / Domain Adaptation | Llambias, S. N.; Nielsen, M.; Mehdipour Ghazi, M. | Click here |
| Image Enhancement / Optimization | Mirjalili, S.; Mirjalili, S. M.; Lewis, A. | Click here |
| Dataset | ADNI | Click here |

Acknowledgments

This work was developed as part of a final-year thesis project. Special thanks to:

  • Dr. Dani Suandi, S.Si., M.Si. — Lecturer in Mathematics, Binus University; Lecturer, School of Computer Science, Binus University — for guidance and supervision throughout this project.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Third-party dependencies (e.g., HD-BET, ANTs) have their own licenses. Please review and comply with their respective license terms. See docs/THIRD_PARTY_NOTICES.md for details.


Additional Resources


For questions or support, please open an issue on GitHub or contact izzat.zaky@gmail.com with subject format: AD Pipeline Issue: {name the issue}.
