This is the implementation of Online Iterative Self-Alignment for Radiology Report Generation.
- pip install -r requirement.txt
- We use two datasets (IU X-Ray and MIMIC-CXR) in our paper.
For IU X-Ray, you can download the dataset from here.
For MIMIC-CXR, you can download the dataset from here. You can apply for access to the dataset here with your PhysioNet license.
- Preference datasets:
For SFT1: data\mimic_cxr\FINAL2\SFT1
For SFT2: data\mimic_cxr\FINAL2\SFT2
For SFT3: data\mimic_cxr\FINAL2\SFT3
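As a quick sanity check, you can verify that the three preference-data splits listed above are in place with a short snippet (a sketch using the forward-slash form of the paths; adjust for Windows):

```shell
#!/usr/bin/env bash
# Sketch: check that the three preference-data splits exist.
# Paths are the forward-slash form of those listed above.
check_preference_splits() {
    local split dir
    for split in SFT1 SFT2 SFT3; do
        dir="data/mimic_cxr/FINAL2/${split}"
        if [ -d "$dir" ]; then
            echo "found: $dir"
        else
            echo "missing: $dir"
        fi
    done
}
check_preference_splits
```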
Run bash \scripts\mimic_cxr\SFT1\mimic_cxr_sft1.sh to train a model on the SFT-1 data.
Run bash \scripts\mimic_cxr\SFT2\mimic_cxr_sft2.sh to train a model on the SFT-2 data.
Run bash \scripts\mimic_cxr\SFT3\mimic_cxr_sft3.sh to train a model on the SFT-3 data.
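If you want to run the three SFT stages back to back, a minimal driver sketch (using the forward-slash form of the script paths listed above) might look like:

```shell
#!/usr/bin/env bash
# Sketch: run the three SFT training stages in order.
# Script paths follow the list above (forward-slash form);
# adjust them if your checkout uses different locations.
run_sft_stages() {
    local stage upper script
    for stage in sft1 sft2 sft3; do
        upper=$(echo "$stage" | tr a-z A-Z)   # sft1 -> SFT1
        script="scripts/mimic_cxr/${upper}/mimic_cxr_${stage}.sh"
        if [ -f "$script" ]; then
            bash "$script" || { echo "failed: $script"; return 1; }
        else
            echo "skipped (not found): $script"
        fi
    done
}
run_sft_stages
```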
You can download the pre-trained CheXbert model from here: CheXbert.
For using RadGraph, you can refer to the following link: RadGraph. The specific model checkpoint can be downloaded from here: model_checkpoint. Place the related files in the MPO_IU\MPO_TRAIN\RadGraph directory.
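For example, after downloading the checkpoint you might place it as follows (a sketch using forward-slash paths; the archive filename is a placeholder, not the real download name):

```shell
#!/usr/bin/env bash
# Sketch: create the expected RadGraph directory and move the
# downloaded checkpoint into it. The commented filename is a
# placeholder; use whatever name the downloaded file actually has.
mkdir -p MPO_IU/MPO_TRAIN/RadGraph
# mv ~/Downloads/<radgraph_checkpoint> MPO_IU/MPO_TRAIN/RadGraph/
```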
For using the GREEN metric, you can refer to the following link: GREEN (Stanford-AIMI/GREEN, EMNLP 2024 Findings): a radiology report generation metric that leverages the natural language understanding of language models to identify and explain clinically significant errors in candidate reports.
For using the RadCliQ metric, you can refer to the following link: RadCliQ.
Run bash test_iu_xray.sh to test a model on the IU X-Ray data.
Run bash test_mimic_cxr.sh to test a model on the MIMIC-CXR data.