A complete, production-ready 3D NIfTI preprocessing pipeline for ADNI T1-weighted MRI data, designed to accelerate Alzheimer's disease research by automating the entire preprocessing workflow from raw NIfTI files to training-ready 2D image sequences.
- ADNI/IDA Compliance Notice
- Overview
- Features
- Pipeline Architecture
- Prerequisites
- Installation
- Quick Start
- Usage
- Configuration
- Project Structure
- Output Structure
- Contributing
- Citation
- Acknowledgments
- License
- Additional Resources
Project status: Stable release. Current version: v1.7.4.
This project is an independent, open-source effort. It is not affiliated with, endorsed by, or sponsored by the Alzheimer's Disease Neuroimaging Initiative (ADNI) or the Imaging Data Archive (IDA) operated by the Laboratory of Neuro Imaging (LONI). "ADNI" and "IDA-LONI" are trademarks of their respective owners and are used solely to indicate data compatibility.
This repository contains code only. It does not host or distribute ADNI data. Access to ADNI/IDA is governed by their Data Use Agreements (DUAs). Users are solely responsible for compliance with all applicable terms.
Before requesting data access, read the ADNI Data Use Agreement and then follow the ADNI data access instructions.
If you use this pipeline with ADNI data, you must comply with the ADNI DUA: https://ida.loni.usc.edu/collaboration/access/appLicense.jsp
- ADNI Website: https://adni.loni.usc.edu/
- ADNI Data Use Agreement: https://ida.loni.usc.edu/collaboration/access/appLicense.jsp
- IDA-LONI Access Portal: https://ida.loni.usc.edu/
- ADNI Publication & Citation Guidelines: https://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Manuscript_Citations.pdf
By using this pipeline with ADNI data, you confirm that you have read, understood, and agree to comply with all terms of the ADNI Data Use Agreement.
This repository provides a complete 3D NIfTI preprocessing pipeline for ADNI T1-weighted MRI data, transforming raw medical imaging files into training-ready 2D image sequences optimized for temporal deep learning models (e.g., CNN+LSTM architectures); the outputs are equally usable with non-temporal models.
The pipeline was developed in early 2025 as part of a final-year thesis project, with the goal of reducing time spent on data wrangling so researchers can focus on modeling and experimentation.
- End-to-End Automation: Complete preprocessing workflow from raw NIfTI to training-ready images
- Modular Design: Each stage can be run independently or as part of the full pipeline
- Resume Capability: Long-running processes can be resumed from checkpoints
- Cross-Platform: Windows (primary) and Unix-based systems (Linux/macOS)
- Production-Ready: Comprehensive error handling, logging, and progress tracking
- GPU Acceleration: Optimized for CUDA-enabled GPUs with automatic fallback to CPU
- ✅ Complete 3D NIfTI Processing Pipeline - From raw files to training-ready 2D sequences
- ✅ Modular Stage Architecture - Run stages independently or end-to-end
- ✅ GPU Acceleration - CUDA support with automatic CPU fallback
- ✅ Resume Capability - Checkpoint-based resumption for long-running processes
- ✅ Rich Console Output - Beautiful, informative progress indicators and summaries
- ✅ JSON Reports - Detailed execution reports for each stage
- ✅ Configuration Management - YAML-based configuration with CLI overrides
- ✅ Convenience Scripts - Pre-built scripts for easy execution (Windows/Unix)
- ✅ Template Support - Pre-configured templates for MNI brain and hippocampus ROI
The pipeline consists of four main stages, each with multiple substages:

1. Environment Setup
   - GPU detection and verification
   - Package dependency checking and installation
   - Performance testing and optimization
2. Data Preparation
   - Split: Stratified train/validation/test splitting
   - Analyze: Metadata analysis and statistics generation
3. NIfTI Processing
   - Skull Stripping: HD-BET-based brain extraction
   - Template Registration: ANTs-based MNI template alignment
   - Labelling: Temporal sequence organization
   - 2D Conversion: NIfTI to PNG conversion with slice extraction
4. Image Processing
   - Center Crop: Temporal sequence extraction and cropping
   - Image Enhancement: Grey Wolf Optimizer-based enhancement
   - Data Balancing: Augmentation and class balancing
- Python: 3.11 or higher (3.12 recommended)
- Operating System: Windows 10/11, Linux, or macOS
- RAM: Minimum 8GB (16GB+ recommended for large datasets)
- Storage: Sufficient space for processed outputs (typically 2-3x input size)
- GPU: Optional but recommended (NVIDIA GPU with CUDA support)
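If you are unsure whether your GPU will be picked up, a quick check with PyTorch (assumed installed; HD-BET depends on it) reproduces the device-selection behavior the `device` setting describes. This is an illustrative snippet, not the pipeline's internal code:

```python
import torch

# Prefer CUDA, allow Apple's MPS backend, otherwise fall back to CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
    print(f"Using GPU: {torch.cuda.get_device_name(0)}")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
    print("Using Apple MPS")
else:
    device = torch.device("cpu")
    print("Falling back to CPU")
```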
- HD-BET - Installed automatically via pip if needed
- ANTs (antspyx) - Required for the template registration stage; install manually with `pip install antspyx` or via the environment setup stage
Before running the pipeline, download the required template files:

- MNI Brain Template
  - Location: `support_files/templates/mni-brain/`
  - Download: MNI152_T1_1mm_brain.nii.gz
  - See `support_files/templates/mni-brain/README.md` for details
- Hippocampus ROI Mask
  - Location: `support_files/templates/hippocampal-roi/`
  - Download: NeuroVault Image 448213
  - See `support_files/templates/hippocampal-roi/README.md` for details
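After downloading, it is worth sanity-checking both templates before the registration stage runs. A minimal sketch using `nibabel` (a common NIfTI reader, assumed available; the file names follow the placement instructions in Quick Start):

```python
from pathlib import Path
import nibabel as nib

templates = [
    Path("support_files/templates/mni-brain/MNI152_T1_1mm_brain.nii.gz"),
    Path("support_files/templates/hippocampal-roi/hippho50.nii.gz"),
]
for path in templates:
    if not path.exists():
        raise FileNotFoundError(f"Missing template: {path}")
    img = nib.load(str(path))
    # Print dimensions and voxel sizes as a basic integrity check.
    print(path.name, img.shape, img.header.get_zooms())
```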
1. Clone the repository:

   ```bash
   git clone https://github.com/zashari/alzheimer-mri-processing-pipeline.git
   cd alzheimer-mri-processing-pipeline
   ```

2. Create a virtual environment:

   ```bash
   # Windows
   python -m venv venv
   venv\Scripts\activate

   # Linux/macOS
   python3 -m venv venv
   source venv/bin/activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Install the package in editable mode:

   ```bash
   pip install -e .
   ```

5. Verify the installation:

   ```bash
   adp --help
   ```
1. Download the template files (see Prerequisites):
   - MNI brain template: place at `support_files/templates/mni-brain/MNI152_T1_1mm_brain.nii.gz`
   - Hippocampus ROI: place at `support_files/templates/hippocampal-roi/hippho50.nii.gz`
2. Configure paths in `configs/default.yaml`:
   - Set `paths.data_root` to your raw dataset directory
   - Set `paths.metadata_csv` to your metadata CSV file path
   - Set `paths.output_root` (default: `outputs`)
3. Run environment setup to verify GPU and dependencies:

   ```bash
   adp environment_setup setup --auto-install true
   ```
After installation and configuration, run the pipeline using the provided convenience scripts:
Execute stages one-by-one to monitor progress:
Windows:

```bat
scripts\windows_based_system\run_environment_setup.bat
scripts\windows_based_system\run_data_preparation.bat
scripts\windows_based_system\run_nifti_processing.bat
scripts\windows_based_system\run_image_processing.bat
```

Linux/macOS:

```bash
./scripts/unix_based_system/run_environment_setup.sh
./scripts/unix_based_system/run_data_preparation.sh
./scripts/unix_based_system/run_nifti_processing.sh
./scripts/unix_based_system/run_image_processing.sh
```

For users with sufficient resources, run everything in one go:

Windows:

```bat
scripts\windows_based_system\run_full_pipeline.bat
```

Linux/macOS:

```bash
./scripts/unix_based_system/run_full_pipeline.sh
```

If you need to customize parameters or use specific options, use the `adp` CLI command directly. See the Usage section below for details.
The easiest way to run the pipeline is using the provided convenience scripts. They handle all the necessary commands automatically.
- `run_environment_setup` - GPU detection, package installation, performance testing
- `run_data_preparation` - Data splitting and metadata analysis
- `run_nifti_processing` - All NIfTI processing substages (skull stripping → template registration → labelling → 2D conversion)
- `run_image_processing` - All image processing substages (center crop → enhancement → balancing)
- `run_full_pipeline` - Complete pipeline end-to-end (all stages sequentially)
For detailed script documentation, see scripts/README.md.
If you need to customize parameters, use specific options, or run individual substages, use the `adp` command directly:
```bash
adp <stage> <action> [--substage <substage>] [options]
```

Environment Setup:

```bash
adp environment_setup verify                     # Quick verification
adp environment_setup setup --auto-install true  # Full setup
```

Data Preparation:

```bash
adp data_preparation split      # Split data - this will automatically generate manifests
adp data_preparation analyze    # Analyze metadata
adp data_preparation manifests  # Generate manifests only
```

NIfTI Processing:

```bash
# Test mode (process samples)
adp nifti_processing test --substage skull_stripping
adp nifti_processing test --substage template_registration

# Process all files
adp nifti_processing process --substage skull_stripping
adp nifti_processing process --substage template_registration
adp nifti_processing process --substage labelling
adp nifti_processing process --substage twoD_conversion
```

Image Processing:

```bash
adp image_processing process --substage center_crop
adp image_processing process --substage image_enhancement
adp image_processing process --substage data_balancing
```

Global options:

- `--set <key=value>` - Override configuration values (repeatable; use commas for arrays)
- `--config <path>` - Specify a custom configuration file
- `--debug` - Enable debug output
- `--quiet` - Suppress non-essential output
- `--dry-run` - Show what would be done without executing
- `--seed <int>` - Set random seed for reproducibility
- `--log-file <path>` - Write logs to a file
```bash
# Custom split ratios (must sum to 1.0)
adp data_preparation split --set data_preparation.split_ratios=0.7,0.2,0.1

# Use CPU instead of GPU for skull stripping
adp nifti_processing process --substage skull_stripping --set nifti_processing.skull_stripping.device=cpu

# Custom output directory
adp image_processing process --substage center_crop --set paths.output_root=/custom/output/path

# Debug mode with custom seed
adp data_preparation split --debug --seed 12345
```

The pipeline uses YAML-based configuration files with hierarchical override support. Configuration is loaded from multiple sources in this order (later sources override earlier ones):
1. `configs/default.yaml` - Main configuration (paths, global settings)
2. `configs/stages/*.yaml` - Stage-specific defaults
3. User config file (via `--config`)
4. Environment variables (`ADP_*` prefix)
5. CLI overrides (via `--set`)
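The precedence is a plain "last source wins" rule applied to nested keys. The sketch below illustrates that merge semantics with ordinary dictionaries; it shows the behavior, not the pipeline's actual loader:

```python
from copy import deepcopy

def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge `override` into `base`; later sources win."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

defaults = {"paths": {"output_root": "outputs"}, "seed": 42}
cli_override = {"paths": {"output_root": "/custom/output/path"}}
print(deep_merge(defaults, cli_override))
# {'paths': {'output_root': '/custom/output/path'}, 'seed': 42}
```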
Before running the pipeline, configure these paths in `configs/default.yaml`:

```yaml
paths:
  data_root: "path/to/raw/nifti_files"   # Your raw dataset directory (e.g., your_project/datasets/ADNI_1_5_T/)
  output_root: "outputs"                 # Output directory (relative to project root)
  metadata_csv: "path/to/metadata.csv"   # Primary metadata CSV file path (e.g., your_project/datasets/ADNI_1_5_T/metadata.csv)
```

Data Preparation:
- `required_visits: ["sc", "m06", "m12"]` - Required visits for complete sequences
- `split_ratios: [0.7, 0.15, 0.15]` - Train/Val/Test ratios (must sum to 1.0)
- `stratify_by: "Group"` - Column name for stratification
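The two-step stratified split behind these settings can be reproduced with scikit-learn; the sketch below shows the 0.7/0.15/0.15 scheme stratified on `Group`, as an illustration rather than the pipeline's internal code:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("path/to/metadata.csv")  # must contain the stratification column, e.g. "Group"

# Split off 70% for training, then halve the remaining 30% into val/test.
train_df, rest_df = train_test_split(df, test_size=0.30, stratify=df["Group"], random_state=42)
val_df, test_df = train_test_split(rest_df, test_size=0.50, stratify=rest_df["Group"], random_state=42)

print(len(train_df), len(val_df), len(test_df))  # ~70% / 15% / 15%
```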
NIfTI Processing:
- `device: "cuda"` - Processing device (`"cuda"`, `"cpu"`, or `"mps"`)
- `mni_template_path` - Path to the MNI brain template
- `hippocampus_roi_path` - Path to the hippocampus ROI mask
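For reference, registration with `antspyx` typically looks like the sketch below. The input filename is hypothetical and the transform choice is illustrative; the pipeline's own registration parameters live in `configs/stages/`:

```python
import ants

# Fixed image: the MNI template placed as described in Prerequisites.
fixed = ants.image_read("support_files/templates/mni-brain/MNI152_T1_1mm_brain.nii.gz")
# Moving image: a skull-stripped scan (hypothetical filename).
moving = ants.image_read("outputs/2_skull_stripping/sub-0001_brain.nii.gz")

# "SyN" performs affine initialization followed by symmetric diffeomorphic registration.
result = ants.registration(fixed=fixed, moving=moving, type_of_transform="SyN")
ants.image_write(result["warpedmovout"], "sub-0001_mni.nii.gz")
```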
Image Processing:
- `augmentation_targets` - Target counts per class for data balancing
- `gwo_iterations` - Number of Grey Wolf Optimizer iterations for enhancement
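`gwo_iterations` controls how long the Grey Wolf Optimizer (Mirjalili et al., see the references table below) searches for enhancement parameters. A self-contained toy GWO loop, minimizing the sphere function rather than the pipeline's image-contrast fitness, shows what one iteration does:

```python
import numpy as np

def gwo(objective, dim, n_wolves=10, iterations=50, bounds=(-10.0, 10.0)):
    """Minimal Grey Wolf Optimizer: the pack circles the three best wolves."""
    lo, hi = bounds
    rng = np.random.default_rng(0)
    wolves = rng.uniform(lo, hi, (n_wolves, dim))
    for t in range(iterations):
        fitness = np.apply_along_axis(objective, 1, wolves)
        alpha, beta, delta = wolves[np.argsort(fitness)[:3]]
        a = 2.0 - 2.0 * t / iterations  # exploration factor decays from 2 to 0
        for i in range(n_wolves):
            candidate = np.zeros(dim)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                candidate += leader - A * np.abs(C * leader - wolves[i])
            wolves[i] = np.clip(candidate / 3.0, lo, hi)
    fitness = np.apply_along_axis(objective, 1, wolves)
    return wolves[np.argmin(fitness)]

best = gwo(lambda x: float(np.sum(x**2)), dim=2)  # optimum at the origin
print(best)
```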
You can override any configuration value via CLI without editing files:
```bash
# Override split ratios
adp data_preparation split --set data_preparation.split_ratios=0.7,0.2,0.1

# Override device and other settings
adp nifti_processing process --substage skull_stripping \
    --set nifti_processing.skull_stripping.device=cpu \
    --set nifti_processing.skull_stripping.use_tta=true

# Use a custom config file
adp data_preparation split --config my_custom_config.yaml
```

For complete configuration options, see the YAML files in the `configs/` directory.
```
alzheimer-mri-processing-pipeline/
├── configs/                       # Configuration files
│   ├── default.yaml               # Main configuration
│   └── stages/                    # Stage-specific configurations
├── docs/                          # Documentation
│   ├── CHANGELOG.md               # Version history
│   └── THIRD_PARTY_NOTICES.md
├── scripts/                       # Convenience scripts
│   ├── unix_based_system/         # Shell scripts (Linux/macOS)
│   └── windows_based_system/      # Batch scripts (Windows)
├── src/
│   └── data_processing/           # Main package
│       ├── cli.py                 # CLI entry point
│       ├── config/                # Configuration management
│       ├── data_preparation/      # Data preparation stage
│       ├── environment_setup/     # Environment setup stage
│       ├── image_processing/      # Image processing stage
│       ├── nifti_processing/      # NIfTI processing stage
│       └── stages/                # Stage registry
├── support_files/
│   └── templates/                 # Template files (MNI brain, ROI masks)
├── outputs/                       # Generated outputs (gitignored)
├── .reports/                      # Execution reports (gitignored)
├── pyproject.toml                 # Package metadata and build configuration
├── requirements.txt               # Python dependencies
├── README.md                      # This file
└── LICENSE                        # MIT License
```
After running the pipeline, outputs are organized as follows:
```
outputs/
├── 1_splitted_sequential/         # Data preparation outputs
│   ├── train/
│   ├── val/
│   └── test/
├── manifests/                     # CSV manifests (at output root)
│   ├── metadata_split.csv
│   ├── train.csv
│   ├── val.csv
│   └── test.csv
├── 2_skull_stripping/             # Skull stripping outputs
├── 3_optimal_slices/              # Template registration outputs
│   ├── axial/
│   ├── coronal/
│   ├── sagittal/
│   └── hippocampus_masks_3D/
├── 4_labelling/                   # Labelling outputs
├── 5_twoD/                        # 2D conversion outputs
├── 6_center_crop/                 # Center crop outputs
├── 7_enhanced/                    # Image enhancement outputs
└── 8_balanced/                    # Data balancing outputs

.reports/                          # JSON execution reports (at project root)
├── environment_setup_*.json
├── data_preparation_*.json
├── nifti_processing_*.json
└── image_processing_*.json

.visualizations/                   # Visualization outputs (at output root)
├── data_preparation/
├── nifti_processing/
│   ├── skull_stripping/
│   ├── template_registration/
│   ├── labelling/
│   └── twoD_conversion/
└── image_processing/
    ├── center_crop/
    ├── image_enhancement/
    └── data_balancing/
```
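Each stage run writes a JSON report into `.reports/`. A small sketch for inspecting recent reports without assuming anything about their schema:

```python
import json
from pathlib import Path

# Newest reports first; print each file's top-level keys.
for report_path in sorted(Path(".reports").glob("*.json"), reverse=True)[:5]:
    with open(report_path) as f:
        report = json.load(f)
    print(report_path.name, "->", sorted(report))
```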
Contributions are welcome! Please follow these guidelines:
- Fork the repository and create a feature branch
- Follow code style - Use consistent formatting and naming conventions
- Add tests - Include tests for new features when possible
- Update documentation - Keep README and docstrings up to date
- Submit a Pull Request - Include a clear description of changes
```bash
# Clone your fork
git clone https://github.com/your-username/alzheimer-mri-processing-pipeline.git
cd alzheimer-mri-processing-pipeline

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # or: venv\Scripts\activate on Windows

# Install dependencies
pip install -r requirements.txt
pip install -e .

# Make changes and test
adp --help

# Run tests (if available)
pytest
```

For more details, see CONTRIBUTING.md and CODE_OF_CONDUCT.md.
If you use this pipeline in your research, please cite it appropriately. This helps track the impact of this work and supports future development.
BibTeX format:
```bibtex
@misc{alzheimer_mri_processing_pipeline_2025,
  title        = {alzheimer-mri-processing-pipeline},
  author       = {Ashari, Zaky and contributors},
  year         = {2025},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/zashari/alzheimer-mri-processing-pipeline}},
  note         = {Version v1.7.4}
}
```

APA format:
Ashari, Z. (2025). alzheimer-mri-processing-pipeline [Computer software]. GitHub.
https://github.com/zashari/alzheimer-mri-processing-pipeline
- `CITATION.cff` - Citation File Format file for automatic citation (GitHub will display a "Cite this repository" button)
This pipeline is built upon and informed by the following research papers and methods. When publishing work that uses this pipeline, please cite the relevant references:
| Methodology | Author(s) | Links |
|---|---|---|
| Skull Stripping | Druzhinina, P.; Kondrateva, E. | Click here |
| Skull Stripping | Isensee, F.; Schell, M.; Tursunova, I.; et al. | Click here Click here |
| Alzheimer's Disease ROI | Hassouneh, A.; Bazuin, B.; Danna-Dos-Santos, A.; Acar, I.; Abdel-Qader, I.; ADNI | Click here |
| Data Augmentation / Domain Adaptation | Llambias, S. N.; Nielsen, M.; Mehdipour Ghazi, M. | Click here |
| Image Enhancement / Optimization | Mirjalili, S.; Mirjalili, S. M.; Lewis, A. | Click here |
| Dataset | ADNI | Click here |
This work was developed as part of a final-year thesis project. Special thanks to:
- Dr. Dani Suandi, S.Si., M.Si., Lecturer in Mathematics, School of Computer Science, Binus University, for guidance and supervision throughout this project.
This project is licensed under the MIT License. See the LICENSE file for details.
Third-party dependencies (e.g., HD-BET, ANTs) have their own licenses. Please review and comply with their respective license terms. See docs/THIRD_PARTY_NOTICES.md for details.
- Changelog - Version history and release notes
- Security Policy - Security reporting and supported versions
- Contributing Guidelines - How to contribute to this project
- Scripts Documentation - Detailed script usage guide
For questions or support, please open an issue on GitHub or contact izzat.zaky@gmail.com with subject format: AD Pipeline Issue: {name the issue}.