This repository contains the implementation of the method described in our paper, "Divide and Conquer: Isolating Normal-Abnormal Attributes in Knowledge Graph-Enhanced Radiology Report Generation".
To set up the necessary environment:
- Clone the repository:

```shell
git clone https://github.com/yourusername/DCG_Enhanced_distilGPT2.git
cd DCG_Enhanced_distilGPT2
```

- Install the latest PyTorch:
  - Visit PyTorch's official website to find the command suitable for your system configuration.
- Install the required dependencies:

```shell
pip install -r requirements.txt
```
Store all the pre-trained weights in the `./checkpoint/` directory. Below are the details and corresponding links for each:
- BiomedCLIP (for offline retrieval)
- MedSAM (for the image encoder)
- distilgpt2 (for the text and node encoder)
- chexbert and bert (for validation)
  - chexbert:
  - bert:
- MIMIC-CXR:
  - Download from PhysioNet.
  - Place the files in `dataset/mimic_cxr/images`. Ensure the path `dataset/mimic_cxr_jpg/physionet.org/files/mimic-cxr-jpg/2.0.0/files` exists. Note: this dataset requires authorization.
- Chen et al. Labels for MIMIC-CXR:
  - Download from one of the following sources:
  - Place `annotations.json` in `dataset/mimic_cxr`. The path should be `dataset/mimic_cxr/annotations.json`.
- Chen et al. Labels and Chest X-Rays in PNG Format for IU X-Ray:
  - Download from one of the following sources:
  - Place the files into `dataset/iu_x-ray`. Ensure the paths `dataset/iu_x-ray/annotations.json` and `dataset/iu_x-ray/images` exist.
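Before training, it may help to verify that the checkpoint directory and dataset paths described above are in place. The following is a minimal, hypothetical sanity-check script (not part of the repository; the helper name `missing_paths` is our own):

```python
# Hypothetical sanity check: verify that the checkpoint directory and the
# dataset paths described in this README exist before starting training.
from pathlib import Path

# Paths taken from the setup instructions above.
REQUIRED_PATHS = [
    "checkpoint",
    "dataset/mimic_cxr/annotations.json",
    "dataset/mimic_cxr_jpg/physionet.org/files/mimic-cxr-jpg/2.0.0/files",
    "dataset/iu_x-ray/annotations.json",
    "dataset/iu_x-ray/images",
]

def missing_paths(root="."):
    """Return the required paths that do not exist under `root`."""
    return [p for p in REQUIRED_PATHS if not (Path(root) / p).exists()]

if __name__ == "__main__":
    missing = missing_paths()
    print("All paths present." if not missing else f"Missing: {missing}")
```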
Note: The dataset directory can be configured for each task using the `dataset_dir` variable in `config/train_mimic_cxr.yaml` and `config/train_iu_xray.yaml`.
To run the project, follow these steps:
- (Optional) Use BiomedCLIP to initialize image features and perform offline retrieval. The results have been pre-saved in `./dataset/iu_xray/annotation_top5.json` and `./dataset/mimic_cxr/annotation_top5.json`. For the specific steps, refer to `tools/offline_retrieval`.
- (Optional) Extract entities from the retrieved reports and initialize them as node features and adjacency matrices. Our pre-processed results are saved in `./dataset/iu_xray/node_mapping.json`, `node_features_gpt2.h5`, `adjacency_matrix_191`, and `./dataset/mimic_cxr/adjacency_matrix_276`. For the specific steps, refer to `tools/generate_graph`.
- Model training and validation: run `python train_ver4_iu_xray.py` or `python train_ver4_mimic.py`.
- Checkpoint and report generation: coming soon.
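Conceptually, the two optional preprocessing steps amount to (1) ranking corpus reports by embedding similarity and keeping the top 5, and (2) turning extracted entities into a node mapping plus a co-occurrence adjacency matrix. The sketch below illustrates both ideas in plain Python; it is a simplified illustration under assumed inputs, not the actual code in `tools/offline_retrieval` or `tools/generate_graph`, and the helper names `top_k_reports` and `build_graph` are hypothetical:

```python
# Simplified illustration of the two preprocessing ideas; not the repository's code.
import math
from itertools import combinations

def top_k_reports(query_emb, corpus_embs, k=5):
    """Rank corpus embeddings by cosine similarity to the query; return top-k indices."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)
    order = sorted(range(len(corpus_embs)),
                   key=lambda i: cosine(query_emb, corpus_embs[i]),
                   reverse=True)
    return order[:k]

def build_graph(entity_lists):
    """Map entity strings to node indices and build a symmetric
    adjacency matrix from per-report entity co-occurrence."""
    node_mapping = {}
    for entities in entity_lists:
        for e in entities:
            node_mapping.setdefault(e, len(node_mapping))
    n = len(node_mapping)
    adjacency = [[0] * n for _ in range(n)]
    for entities in entity_lists:
        for a, b in combinations(set(entities), 2):
            i, j = node_mapping[a], node_mapping[b]
            adjacency[i][j] = adjacency[j][i] = 1
    return node_mapping, adjacency
```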
Note: The complete execution steps, the code for processing image and graph features (only for IU X-Ray; MIMIC-CXR requires authorization), and the weights will be uploaded later.
- See `folder_structure.txt` for the full directory layout.
If you find our work useful, please consider citing our paper:
Coming soon.
This project is built upon cvt2distilgpt2 and MedSAM. We would like to thank them for their great work.