Enhancing Plaque Segmentation in CCTA with Prompt-based Diffusion Data Augmentation

Authors: Yizhe Ruan, Xuangeng Chu, Ziteng Cui, Yusuke Kurose, Junichi Iho, Yoji Tokunaga, Makoto Horie, Yusaku Hayashi, Keisuke Nishizawa, Yasushi Koyama, Tatsuya Harada

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate PromptLesion on a private CCTA dataset and multi-organ tumor segmentation tasks (kidney, liver, pancreas) using public datasets, achieving superior performance compared to baseline methods. Models trained with our prompt-guided synthetic augmentation significantly improve Dice Similarity Coefficient (DSC) scores for both plaque and tumor segmentation. Extensive evaluations and ablation studies confirm the effectiveness of prompt conditioning.
Researcher Affiliation | Academia | Yizhe Ruan (1,2), Xuangeng Chu (1), Ziteng Cui (1), Yusuke Kurose (1,2), Junichi Iho (3), Yoji Tokunaga (3), Makoto Horie (3), Yusaku Hayashi (3), Keisuke Nishizawa (3), Yasushi Koyama (3,2), Tatsuya Harada (1,2). Affiliations: (1) The University of Tokyo; (2) RIKEN Center for Advanced Intelligence Project; (3) Sakurabashi Watanabe Advanced Healthcare Hospital.
Pseudocode | No | The paper describes the methodology using textual explanations and figures, but it does not include a clearly labeled pseudocode block or algorithm.
Open Source Code | Yes | https://github.com/Ruan Yizhe-77/Prompt Lesion TMLR
Open Datasets | Yes | Public Tumor Datasets. For tumor segmentation, we used the same public datasets as DiffTumor Chen et al. (2024): the Kidney Tumor dataset (KiTS) Heller et al. (2020), the Liver Tumor dataset (LiTS) Bilic et al. (2023), and the Pancreas Tumor dataset (MSD Pancreas) Antonelli et al. (2022).
Dataset Splits | Yes | For plaque-related experiments, we used a private set of 100 cardiac CTA volumes acquired at the Sakurabashi Watanabe Advanced Healthcare Hospital. Following the data split ratio of a prior work Ruan et al. (2025), the cohort was divided into a training set of 70 patients, a validation set of 10, and a test set of 20. The annotation procedure was carried out by cardiac experts.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies | No | Our training pipeline is implemented using the MONAI framework. No specific version number for MONAI or other key software libraries is provided.
Experiment Setup | Yes | The random seed for all experiments was set to 1234 to ensure reproducibility. Our training pipeline is implemented using the MONAI framework. To address class imbalance, we employed MONAI's class-aware cropping methods, specifically RandCropByPosNegLabeld and RandCropByLabelClassesd, to ensure that training patches were sampled with a balanced representation of different lesion types. For data augmentation, we applied only random 90-degree rotations (RandRotate90d with a probability of 0.1). All models are trained for 500 epochs with early stopping on the validation set. We use the DiceCELoss Sudre et al. (2017) for all segmentation training. The proportion of synthetic data is tuned on the validation set; we found that a 1:1 synthetic:real ratio achieved the best results without overwhelming the network with possibly redundant synthetic samples.
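
The 70/10/20 patient split and the 1:1 synthetic:real ratio reported in the table can be illustrated with a short standard-library Python sketch. It is not the authors' pipeline. It assumes the split is a seeded random shuffle at the patient level (the source gives only the split sizes) and reuses the reported seed 1234. The helper names `split_patients` and `mix_synthetic`, the `case_###` and `synth_###` identifiers, and the synthetic pool size of 200 are hypothetical.

```python
import random

SEED = 1234  # seed reported in the experiment setup


def split_patients(patient_ids, n_train=70, n_val=10, n_test=20, seed=SEED):
    """Shuffle patient IDs with a fixed seed and cut them into three disjoint sets.

    Assumption: a random patient-level split; the source only reports the sizes.
    """
    if n_train + n_val + n_test != len(patient_ids):
        raise ValueError("split sizes must sum to the cohort size")
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def mix_synthetic(real_cases, synthetic_cases, ratio=1.0, seed=SEED):
    """Return the real cases plus round(ratio * len(real_cases)) synthetic cases.

    ratio=1.0 corresponds to the 1:1 synthetic:real setting described above.
    """
    n_synth = round(ratio * len(real_cases))
    if n_synth > len(synthetic_cases):
        raise ValueError("not enough synthetic cases for the requested ratio")
    rng = random.Random(seed)
    chosen = rng.sample(list(synthetic_cases), n_synth)
    mixed = [("real", c) for c in real_cases] + [("synthetic", c) for c in chosen]
    rng.shuffle(mixed)  # interleave real and synthetic cases
    return mixed


# Hypothetical identifiers; the private cohort's naming is not described in the source.
patients = [f"case_{i:03d}" for i in range(100)]
train_ids, val_ids, test_ids = split_patients(patients)

# Hypothetical pool of synthetic volumes, mixed into the training split only.
synthetic_pool = [f"synth_{i:03d}" for i in range(200)]
train_set = mix_synthetic(train_ids, synthetic_pool, ratio=1.0)

print(len(train_ids), len(val_ids), len(test_ids), len(train_set))  # prints: 70 10 20 140
```

Synthetic cases are added only to the training split in this sketch, so the validation and test sets stay purely real. That matches the statement that the synthetic proportion was tuned on the validation set.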