Project: #3 - HipMRI 2D Segmentation with Improved UNet
This project implements an Improved UNet architecture for automated prostate segmentation from MRI images using the HipMRI Study dataset. The goal is to achieve a Dice similarity coefficient of ≥ 0.75 on the prostate label (Class 3) on the test set.
Medical image segmentation is crucial for radiotherapy planning in prostate cancer. This project segments four anatomical regions from 2D magnetic resonance imaging (MRI) slices:
- Class 0: Background
- Class 1: Body
- Class 2: Bone
- Class 3: Prostate (Primary Target)
The Improved UNet incorporates several enhancements over the original UNet architecture:
Key Improvements:
- Deeper Network: Five levels of encoding/decoding, compared with four in the standard UNet.
- Residual Connections: Skip connections using residual blocks for better gradient flow.
- Instance Normalization: More stable than batch normalization for small batch sizes.
- Leaky ReLU: Allows a small gradient for negative inputs (negative slope alpha = 0.01), avoiding dead neurons.
- Deep Supervision: Additional loss at intermediate decoder layers.
- Context Module: Additional context aggregation at bottleneck.
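The residual blocks with instance normalization and Leaky ReLU listed above could be sketched as follows. This is a hypothetical implementation (the class name `ResidualDoubleConv` appears in the architecture diagram below, but the exact layer ordering in `modules.py` may differ):

```python
import torch
import torch.nn as nn

class ResidualDoubleConv(nn.Module):
    """Two 3x3 convs with InstanceNorm + LeakyReLU and a residual shortcut.
    Hypothetical sketch; the project's modules.py may differ in detail."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.InstanceNorm2d(out_ch),
            nn.LeakyReLU(0.01, inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.InstanceNorm2d(out_ch),
        )
        # 1x1 conv matches channel counts when in_ch != out_ch
        self.shortcut = (nn.Identity() if in_ch == out_ch
                         else nn.Conv2d(in_ch, out_ch, 1, bias=False))
        self.act = nn.LeakyReLU(0.01)

    def forward(self, x):
        # Residual addition keeps gradients flowing through deep stacks
        return self.act(self.body(x) + self.shortcut(x))

y = ResidualDoubleConv(1, 64)(torch.randn(2, 1, 256, 128))
print(tuple(y.shape))  # (2, 64, 256, 128)
```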
Input: (N, 1, 256, 128) - Grayscale MRI images
[Encoder Path - Downsampling]
Level 0: ResidualDoubleConv: 1 -> 64 channels (256×128)
MaxPool2d(2×2)
Level 1: ResidualDoubleConv: 64 -> 128 channels (128×64)
MaxPool2d(2×2)
Level 2: ResidualDoubleConv: 128 -> 256 channels (64×32)
MaxPool2d(2×2)
Level 3: ResidualDoubleConv: 256 -> 512 channels (32×16)
MaxPool2d(2×2)
Level 4 (Bottleneck): ResidualDoubleConv: 512 -> 1024 channels (16×8)
[Context Aggregation Module]
Parallel dilated convolutions with rates [1, 2, 4, 8]
Receptive fields: 3×3, 7×7, 15×15, 31×31
Aggregated multi-scale features (1024 channels)
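The context aggregation step above could look like the following minimal sketch: parallel 3x3 convolutions at dilation rates [1, 2, 4, 8] whose outputs are summed. The class name and the summation-based fusion are assumptions; the actual bottleneck module in `modules.py` may fuse branches differently (e.g. by concatenation plus a 1x1 conv):

```python
import torch
import torch.nn as nn

class ContextAggregation(nn.Module):
    """Parallel 3x3 dilated convs (rates 1, 2, 4, 8), outputs summed.
    Hypothetical sketch of the bottleneck context module."""
    def __init__(self, ch=1024, rates=(1, 2, 4, 8)):
        super().__init__()
        # padding = dilation keeps the spatial size for a 3x3 kernel
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=r, dilation=r) for r in rates
        )

    def forward(self, x):
        # Sum multi-scale responses; channel count is unchanged
        return sum(b(x) for b in self.branches)
```

At the bottleneck this would be applied to the (N, 1024, 16, 8) feature map, leaving its shape unchanged.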
[Decoder Path with Deep Supervision]
Level 3: TransposeConv + Skip + ResidualDoubleConv: 1024 -> 512 (32×16)
├─ Auxiliary Output: DSV4 (512 -> 4 classes)
Level 2: TransposeConv + Skip + ResidualDoubleConv: 512 -> 256 (64×32)
├─ Auxiliary Output: DSV3 (256 -> 4 classes)
Level 1: TransposeConv + Skip + ResidualDoubleConv: 256 -> 128 (128×64)
├─ Auxiliary Output: DSV2 (128 -> 4 classes)
Level 0: TransposeConv + Skip + ResidualDoubleConv: 128 -> 64 (256×128)
├─ Auxiliary Output: DSV1 (64 -> 4 classes)
[Output Layer]
1×1 Convolution: 64 -> 4 channels
Output: (N, 4, 256, 128) - Class logits
1. Residual Blocks
- Two 3x3 convolutions with an identity skip connection
- Enables gradient flow in deep networks
2. Instance Normalization
- Normalizes per sample
- More stable than Batch Normalization for medical imaging
3. Context Aggregation
- Parallel dilated convolutions at bottleneck
- Captures features at multiple scales (3x3 to 31x31)
4. Deep Supervision
- Four auxiliary outputs (DSV1-DSV4) in addition to the main output
- Loss weights: 1.0, 0.8, 0.6, 0.4, 0.2
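The weighted deep-supervision loss could be computed as below. This is a sketch under two assumptions: the weight 1.0 applies to the main output, and auxiliary maps are upsampled to the label resolution before the cross-entropy; `train.py` may instead downsample the labels:

```python
import torch
import torch.nn.functional as F

def deep_supervision_loss(main_logits, aux_logits, target,
                          weights=(1.0, 0.8, 0.6, 0.4, 0.2)):
    """Weighted sum of cross-entropy over the main output and the
    auxiliary (DSV) outputs. Hypothetical sketch of the training loss."""
    outputs = [main_logits] + list(aux_logits)
    loss = 0.0
    for w, logits in zip(weights, outputs):
        # Upsample coarser auxiliary maps to the label resolution
        if logits.shape[-2:] != target.shape[-2:]:
            logits = F.interpolate(logits, size=target.shape[-2:],
                                   mode='bilinear', align_corners=False)
        loss = loss + w * F.cross_entropy(logits, target)
    return loss
```

Here `target` is a (N, H, W) tensor of class indices in {0, 1, 2, 3}.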
Source: HipMRI Study on Prostate Cancer
Format: NIfTI (.nii.gz)
Data Splits:
- Training: 11,460 slices
- Validation: 660 slices
- Testing: 540 slices
Preprocessing:
- Load NIfTI files with nibabel
- Resize to 256x128
- Z-score normalization: `(img - mean) / std`
- Clean invalid labels (≥4 -> class 0)
- One-hot encode to 4 classes
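The normalization, label-cleaning, and one-hot steps could be sketched as follows (nibabel loading and the cv2 resize to 256x128 are omitted; the function name is illustrative, not necessarily what `dataset.py` uses):

```python
import numpy as np

def preprocess(img, seg, n_classes=4):
    """Z-score normalise an MRI slice and clean/one-hot its label map.
    Hypothetical sketch of dataset.py's per-slice preprocessing."""
    img = img.astype(np.float32)
    img = (img - img.mean()) / (img.std() + 1e-8)   # z-score, guard std=0
    seg = seg.astype(np.int64)
    seg[seg >= n_classes] = 0                       # invalid labels -> background
    one_hot = np.eye(n_classes, dtype=np.float32)[seg]  # (H, W, 4)
    return img, one_hot
```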
torch>=2.0.0
numpy>=1.24.0
nibabel>=5.0.0
matplotlib>=3.7.0
opencv-python>=4.7.0
tqdm>=4.65.0

UNet_Prostate_47222610/
├── README.md # This file
├── dataset.py # Data loading and preprocessing for MRI slices
├── modules.py # Improved UNet architecture
├── predict.py # Testing and visualization
├── train.py # Training with deep supervision
└── Result_Images/ # Visualization results
├── training_curves.png
├── prediction_batch_0.png
├── prediction_batch_1.png
├── prediction_batch_2.png
├── prediction_batch_3.png
└── prediction_batch_4.png
python train.py

Training parameters are hardcoded: 30 epochs, batch size 16, learning rate 1e-4.
python predict.py

- Platform: Rangpur HPC (The University of Queensland)
- GPU: NVIDIA A100
- Training Time: ~2 hours for 30 epochs
- Architecture: Improved UNet (5-level encoder/decoder)
- Epochs: 30
- Batch size: 16
- Learning rate: 1e-4 (Adam optimizer)
- Weight decay: 1e-5 (L2 regularization)
- Loss function: CrossEntropyLoss + Deep Supervision
- Image size: 256×128
- Number of classes: 4
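With the hyperparameters above, a single optimisation step might look like this. The `train_step` helper and the stand-in model are illustrative (deep supervision is omitted for brevity); only the optimiser settings come from the list above:

```python
import torch
import torch.nn as nn

def train_step(model, optimizer, images, labels):
    """One Adam step with plain cross-entropy loss.
    Hypothetical simplification of the loop in train.py."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)                           # (N, 4, H, W)
    loss = nn.functional.cross_entropy(logits, labels)
    loss.backward()
    optimizer.step()
    return loss.item()

model = nn.Conv2d(1, 4, 1)  # stand-in for the Improved UNet
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
```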
After training and evaluation:
UNet_Prostate_47222610/
├── improved_unet_best.pth # Best model
├── improved_unet_final.pth # Final model
├── improved_unet_epoch_*.pth # Checkpoints
├── logs/
│ └── improved_unet_*.out # Training logs (text)
└── Result_Images/
├── training_curves.png # Loss/Dice plots
└── prediction_batch_*.png # Sample predictions
| Class | Region | Dice |
|---|---|---|
| 0 | Background | 0.9881 |
| 1 | Body | 0.9842 |
| 2 | Bone | 0.9271 |
| 3 | Prostate (Target) | 0.9552 |
Project Requirement: Prostate Dice ≥ 0.75
Achievement: 0.9552 (Exceeds requirement by 27.4%)
Status: PASSED
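The per-class Dice scores in the table follow the standard definition, Dice = 2|P∩G| / (|P| + |G|). A minimal evaluation sketch (the exact smoothing constant and batching in `predict.py` may differ):

```python
import torch

def dice_per_class(pred, target, n_classes=4, eps=1e-6):
    """Per-class Dice between argmax predictions and ground-truth labels,
    both (N, H, W) index tensors. Hypothetical evaluation sketch."""
    scores = []
    for c in range(n_classes):
        p = (pred == c).float()
        g = (target == c).float()
        inter = (p * g).sum()
        # 2*intersection / (|pred| + |gt|), eps guards empty classes
        scores.append(((2 * inter + eps) / (p.sum() + g.sum() + eps)).item())
    return scores
```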
Figure 1: Training loss and prostate Dice coefficient over 30 epochs.
Figure 2: Sample predictions on test set. Left: Input MRI, Center: Ground truth, Right: Model prediction.
- Isensee, F., Kickingereder, P., Wick, W., Bendszus, M., & Maier-Hein, K. H. (2018). "Brain Tumor Segmentation and Radiomics Survival Prediction: Contribution to the BRATS 2017 Challenge." arXiv preprint arXiv:1802.10508.
- Ronneberger, O., Fischer, P., & Brox, T. (2015). "U-Net: Convolutional Networks for Biomedical Image Segmentation." MICCAI 2015.
- Yu, F., & Koltun, V. (2016). "Multi-Scale Context Aggregation by Dilated Convolutions." ICLR 2016.
- COMP3710 Assignment Specification. The University of Queensland, 2025.
- Code written independently following course materials and cited papers
- AI tools (ChatGPT) were used to assist in understanding and to provide reference material for writing docstrings
Student Name: Chia Jou Lu
Student ID: 47222610
Course: COMP3710 Pattern Recognition
Institution: The University of Queensland
Date: November 2025



