This repository contains the complete source code and pipeline for the Master's thesis:
“Parameter-Efficient Adaptation of Open-Source Language Models for Clinical MRI Protocol Automation.”
This thesis examines the capability of modern open-source large language models (LLMs) to automate MRI protocol assignment. It focuses on adapting these models efficiently for this specialized clinical task while ensuring local deployment to maintain data privacy and transparency.
The project is structured as an end-to-end pipeline:
- **Data Preprocessing** (`data_processing.py`)
  Loads clinical data from Excel files and formats each case into a structured conversation (see the sketch after this list):
  - System message: Defines the task
  - User message: Provides patient data and available MRI programs
  - Assistant message: Contains the ground-truth sequences in JSON format
- **Model Adaptation** (`train.py`)
  Fine-tunes a pre-trained base LLM (e.g., Llama, MedGemma) using parameter-efficient fine-tuning (PEFT) methods, enabling the model to output clinically valid sequences in JSON.
- **Inference** (`inference.py`)
  Loads the fine-tuned (or pre-trained) model, processes new patient data, and generates MRI sequence recommendations.
- **Evaluation** (`evaluate.py`)
  Compares model outputs with expert-annotated ground truth using metrics such as exact and fuzzy matching, edit distance, and semantic similarity (BioBERT and MiniLM).
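For illustration, a single training case might be rendered into a chat-format structure like the one below. This is a hypothetical sketch: the actual system prompt, field wording, and JSON schema are defined in `src/data_processing.py`.

```python
# Hypothetical example of the three-message conversation built per case;
# the real prompt template lives in src/data_processing.py.
example_case = [
    {
        "role": "system",
        "content": (
            "You are a radiology assistant. Select the MRI sequences for the "
            "patient from the available programs. Answer with JSON only."
        ),
    },
    {
        "role": "user",
        "content": (
            "Indication: suspected stroke\n"
            "Symptoms: acute left-sided weakness\n"
            "Age: 67\n"
            "Gender: F\n"
            "Available programs: axial FLAIR, axial DWI, axial T1, ..."
        ),
    },
    {
        "role": "assistant",
        "content": '{"sequences": ["axial FLAIR", "axial DWI"]}',
    },
]
```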
- PEFT Methods: Supports LoRA, VeRA, and Prompt Tuning.
- Model-Agnostic: Works with open-source decoder-only LLMs (e.g., Llama, Phi, MedGemma).
- Main Loss Function: Cross-entropy loss.
- Custom Loss Function: Implements `CustomTrainer` using Focal Loss and example-level weighting for rare medical sequences (a sketch follows this list).
- Structured Output: Forces models to produce clean, parsable JSON for automation.
- Robust Evaluation: Multi-metric evaluation to measure clinical utility beyond simple accuracy.
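To illustrate the focal-loss idea, the following trainer subclass down-weights tokens the model already predicts confidently, so rare, hard sequences contribute more to the gradient. It is a minimal sketch assuming a Hugging Face `transformers` setup; the repository's `CustomTrainer` in `src/model_utils.py`, which also adds example-level weighting, is the authoritative version.

```python
import torch
import torch.nn.functional as F
from transformers import Trainer

class FocalLossTrainer(Trainer):
    """Illustrative causal-LM trainer with token-level focal loss."""

    def __init__(self, *args, gamma: float = 2.0, **kwargs):
        super().__init__(*args, **kwargs)
        self.gamma = gamma  # focusing parameter; higher = stronger down-weighting

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        # Shift logits/labels so each position predicts the next token.
        logits = outputs.logits[..., :-1, :].contiguous()
        labels = labels[..., 1:].contiguous()
        ce = F.cross_entropy(
            logits.view(-1, logits.size(-1)),
            labels.view(-1),
            reduction="none",
            ignore_index=-100,
        )
        pt = torch.exp(-ce)                      # confidence in the true token
        focal = (1.0 - pt) ** self.gamma * ce    # down-weight easy tokens
        mask = labels.view(-1) != -100           # skip padding/masked positions
        loss = focal[mask].mean()
        return (loss, outputs) if return_outputs else loss
```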
```
peft-mri-protocol-automation/
├── configs/
│   └── config.yaml            # Configuration file (paths, models, training)
├── data/
│   ├── train.xlsx             # Training dataset
│   └── evaluation.xlsx        # Test dataset
├── model/
│   └── ...                    # Trained PEFT adapters
├── out/
│   └── ...                    # Logs and evaluation outputs
├── src/
│   ├── __init__.py
│   ├── main.py                # Main entry point
│   ├── train.py               # PEFT fine-tuning
│   ├── inference.py           # Sequence generation
│   ├── evaluate.py            # Evaluation metrics
│   ├── config.py              # Configuration loader
│   ├── model_utils.py         # PEFT configs, Focal Loss custom trainer
│   └── data_processing.py     # Data loading, prompt formatting, tokenization
├── tests/                     # Unit tests
│   ├── __init__.py
│   ├── test_config.py
│   ├── test_data_processing.py
│   ├── test_evaluate.py
│   ├── test_inference.py
│   └── test_model_utils.py
├── requirements.txt           # Dependencies
└── README.md                  # This file
```
Clone the repository:

```bash
git clone https://github.com/your-username/peft-mri-protocol-automation.git
cd peft-mri-protocol-automation
```

Install dependencies:

```bash
pip install -r requirements.txt
```

All pipeline settings are controlled by `configs/config.yaml`. Before running the pipeline, edit this file (an illustrative layout is sketched after the list):

- `base_project_dir`: Set this to the absolute path of the cloned repository.
- `model_mapping`: Update paths to your pre-trained base models (e.g., `Llama-3.1-8B`).
- `data_paths`: Verify the names of your training and validation data files.
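The following YAML only guesses at the file's shape using the keys named above; adapt paths and model keys to your environment and to the actual `configs/config.yaml`:

```yaml
# Illustrative structure only; align key names with the real configs/config.yaml.
base_project_dir: /home/you/peft-mri-protocol-automation
model_mapping:
  llama: /models/Llama-3.1-8B    # local path to the base model weights
  gemma: /models/MedGemma
data_paths:
  train: data/train.xlsx
  val: data/val.xlsx
  evaluation: data/evaluation.xlsx
```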
The pipeline expects data in Excel files (a quick column check is sketched after this list):

- Training/Validation (`train.xlsx`, `val.xlsx`)
  Must contain the following columns: `Indication`, `Symptoms`, `Age`, `Gender`, `Protocol`.
  The `Protocol` column contains the ground-truth sequence list, e.g., `"axial FLAIR, axial DWI"`.
- Test (`evaluation.xlsx`)
  Must contain: `Indication`, `Symptoms`, `Age`, `Gender`.
  The `Protocol Zeynep` and `Protocol Ralf` columns are used as ground truth.
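Before training, it can be worth failing fast on malformed spreadsheets. A small sketch using `pandas` (column names as listed above; file paths are examples):

```python
import pandas as pd

REQUIRED_TRAIN_COLS = {"Indication", "Symptoms", "Age", "Gender", "Protocol"}
REQUIRED_TEST_COLS = {"Indication", "Symptoms", "Age", "Gender"}

def check_columns(path: str, required: set) -> None:
    """Raise early if an Excel file is missing any expected column."""
    df = pd.read_excel(path)
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"{path} is missing columns: {sorted(missing)}")

check_columns("data/train.xlsx", REQUIRED_TRAIN_COLS)
check_columns("data/evaluation.xlsx", REQUIRED_TEST_COLS)
```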
The main entry point is `src/main.py`. You can run training, inference, and evaluation using command-line arguments.

Run inference and evaluation using a pre-trained model without fine-tuning:

```bash
python src/main.py \
    --fine_tuning_method "none" \
    --model_name "llama" \
    --layers "qkvo" \
    --test_file "evaluation" \
    --model_folder "model" \
    --rank "2"
```

Fine-tune with LoRA, then run inference and evaluation:

```bash
python src/main.py \
    --fine_tuning_method "lora" \
    --model_name "llama" \
    --layers "qkvo" \
    --rank "2" \
    --test_file "evaluation" \
    --model_folder "model"
```

This command will:

- Train: Fine-tune the Llama model using LoRA with a rank of 2.
- Infer: Run inference on the evaluation test file.
- Evaluate: Calculate and print performance metrics.

| Argument | Description |
|---|---|
| `--fine_tuning_method` | PEFT method to use: `none` (baseline inference only), `lora` (Low-Rank Adaptation), `vera` (Vector-based Random Matrix Adaptation), `prompt` (Prompt Tuning) |
| `--model_name` | Key from the `config.yaml` model mapping (e.g., `llama`, `qwen`, `gemma`) |
| `--layers` | Target modules for LoRA/VeRA (e.g., `qkvo`, `o`, `q`) |
| `--rank` | Rank for the PEFT method (e.g., `8`, `16`, `32`) |
| `--test_file` | Name of the test file in your `/data/` directory (e.g., `evaluation`) |
| `--model_folder` | Directory (e.g., `model`) to save/load the trained adapter |
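For orientation, this is roughly how such arguments could map onto `peft` configuration objects. Everything here is a hypothetical sketch (the function name, the `lora_alpha` heuristic, and reusing `rank` as the number of virtual prompt tokens are assumptions); the real mapping lives in `src/model_utils.py`:

```python
from peft import LoraConfig, PromptTuningConfig, TaskType, VeraConfig

def build_peft_config(method: str, layers: str, rank: int):
    """Hypothetical mapping from CLI arguments to a PEFT config."""
    # "qkvo" -> ["q_proj", "k_proj", "v_proj", "o_proj"] on Llama-style models
    target_modules = [f"{c}_proj" for c in layers]
    if method == "lora":
        return LoraConfig(task_type=TaskType.CAUSAL_LM, r=rank,
                          lora_alpha=2 * rank, target_modules=target_modules)
    if method == "vera":
        return VeraConfig(task_type=TaskType.CAUSAL_LM, r=rank,
                          target_modules=target_modules)
    if method == "prompt":
        return PromptTuningConfig(task_type=TaskType.CAUSAL_LM,
                                  num_virtual_tokens=rank)
    return None  # "none": baseline inference, no adapter
```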
The `evaluate.py` script computes a range of metrics to provide a comprehensive view of model performance (a condensed sketch follows the list):

- Lexical Similarity: Metrics based on `rapidfuzz` (fuzzy ratio) and Levenshtein edit distance, to account for minor spelling variations.
- Semantic Similarity: Uses `sentence-transformers` (BioBERT and MiniLM) to compute cosine similarity between predicted and ground-truth sequences. This metric correctly identifies semantically equivalent sequences (e.g., `"axial T1"` vs. `"T1 axial"`).
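A condensed sketch of these metrics on a single prediction (the MiniLM checkpoint shown is one common choice; a BioBERT-based sentence-transformer can be swapped in the same way, and `evaluate.py` remains the reference implementation):

```python
from rapidfuzz import fuzz
from rapidfuzz.distance import Levenshtein
from sentence_transformers import SentenceTransformer, util

pred = "T1 axial, axial DWI"
truth = "axial T1, axial DWI"

# Lexical similarity: tolerant to small spelling/ordering differences.
fuzzy_ratio = fuzz.ratio(pred, truth)              # 0-100 scale
edit_distance = Levenshtein.distance(pred, truth)  # raw character edits

# Semantic similarity: embed both strings and compare directions.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
emb = model.encode([pred, truth], convert_to_tensor=True)
cosine = util.cos_sim(emb[0], emb[1]).item()

print(fuzzy_ratio, edit_distance, round(cosine, 3))
```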
If you use this repository, please cite:
Ganji, Z. (2025). Parameter-Efficient Adaptation of Open-Source Language Models for Clinical MRI Protocol Automation. Master's thesis, Rheinische Friedrich-Wilhelms-Universität Bonn, Computer Science.