iDPA: Instance Decoupled Prompt Attention for Incremental Medical Object Detection

Authors: Huahui Yi, Wei Xu, Ziyuan Qin, Xi Chen, Xiaohu Wu, Kang Li, Qicheng Lao

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | experiments demonstrate that iDPA outperforms existing SOTA methods, with FAP improvements of 5.44%, 4.83%, 12.88%, and 4.59% in full data, 1-shot, 10-shot, and 50-shot settings, respectively.
Researcher Affiliation | Academia | (1) West China Biomedical Big Data Center, West China Hospital, Sichuan University; (2) School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China; (3) Case Western Reserve University; (4) Sports Medicine Center, Department of Orthopedics and Orthopedic Research Institute, West China Hospital, West China School of Medicine, Sichuan University; (5) Department of Orthopedics and Orthopedic Research Institute, West China Hospital, Sichuan University; (6) Beijing University of Posts and Telecommunications; (7) Sichuan University Pittsburgh Institute. Correspondence to: Kang Li <EMAIL>, Qicheng Lao <EMAIL>.
Pseudocode | No | The paper describes methods through textual explanation and mathematical formulations (e.g., Eq. (1)-(13)) and theoretical analysis (Section A), but does not contain explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/HarveyYi/iDPA.git.
Open Datasets | Yes | To ensure a comprehensive evaluation, we collected 13 MOD tasks (Jha et al., 2020; Boccardi et al., 2015; Cassidy et al., 2021; Liu et al., 2020; Gong et al., 2021; Setio et al., 2017; Vu et al., 2019) from publicly available datasets for IMOD, named ODinM-13. This benchmark evaluates model performance in real medical scenarios, covering 8 imaging modalities across 9 organs. ... We collected 13 public datasets from the internet: DFUC-2020 (DFUC) (Cassidy et al., 2021), Kvasir (Jha et al., 2020), Optic Nerv (Optic N), BCCD, CPM-17 (Vu et al., 2019), Breast Cancer (Breast C), TBX11K (Liu et al., 2020), Kidney Tumor (Kidney T), Luna16 (Setio et al., 2017), ADNI (Boccardi et al., 2015), Meningioma (Meneng), Breast Tumor (Breast T), and TN3K (Gong et al., 2021).
Dataset Splits | No | The paper mentions few-shot settings (k=1, 10, 50, ensuring each class has at least k objects) and full data settings, and describes the creation of the ODinM-13 benchmark from existing datasets. However, it does not explicitly provide percentages or counts for training, validation, and testing splits for each task within these datasets, nor does it reference predefined splits with citations or specific file names for custom splits.
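The k-shot constraint quoted above (each class covered by at least k objects) admits a simple greedy construction. The sketch below is a hypothetical illustration of one such selection rule, not the paper's actual sampling code; the `select_k_shot` helper and its annotation format are assumptions for illustration.

```python
from collections import Counter

def select_k_shot(annotations, k):
    """Greedily pick images until every class has at least k object instances.

    annotations: list of (image_id, [class labels of the image's boxes]).
    A hypothetical sketch of the "at least k objects per class" rule.
    """
    counts = Counter()
    selected = []
    for image_id, labels in annotations:
        # Take an image only if it contains some still under-represented class.
        if any(counts[c] < k for c in labels):
            selected.append(image_id)
            counts.update(labels)
    return selected

anns = [("img1", ["polyp"]), ("img2", ["polyp"]), ("img3", ["tumor", "polyp"])]
subset = select_k_shot(anns, 1)  # ["img1", "img3"]
```

Note that a greedy pass can overshoot k for classes that co-occur with rarer ones (here, "polyp" ends up with two instances), which is consistent with the "at least k" phrasing.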
Hardware Specification | Yes | For full-data training, we use four NVIDIA 3090 GPUs with a batch size of 4, while for few-shot training, we use a single NVIDIA 3090 GPU with a batch size of 1.
Software Dependencies | No | The proposed method is implemented in Python using the PyTorch library and runs on a PC. The code is based on the official GLIP (Li et al., 2022) implementation, and its environment requirements remain unchanged.
Experiment Setup | Yes | All experiments employ AdamW (Loshchilov, 2017) with a multistep learning rate scheduler. The learning rate is set to 0.1, and weight decay is set to 1e-4. The experiments run on 4 GPUs with a batch size of 1 per GPU for 5 epochs, with a learning rate decay of 0.1 at epoch 3. All results are averaged over 3 random seeds, with the task order determined by the seed. ... For all prompt-based CL methods ..., the initial learning rate is set to 1e-2, whereas ZiRa ... uses an initial learning rate of 1e-3. Standard fine-tuning (FT), joint training, sequential training, WiSE-FT ..., and experience replay (ER) ... use an initial learning rate of 1e-4.
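The schedule quoted above (initial learning rate 0.1, decayed by 0.1 at epoch 3, trained for 5 epochs) can be sketched in plain Python; this mirrors what PyTorch's `MultiStepLR` with `milestones=[3]` and `gamma=0.1` would produce, but the helper below is an illustrative assumption, not the paper's code.

```python
def multistep_lr(base_lr, milestones, gamma, epoch):
    """Learning rate at a 0-indexed epoch under a multistep schedule:
    base_lr is multiplied by gamma once per milestone already passed."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed

# Setup from the text: lr 0.1, decay 0.1 at epoch 3, 5 epochs total.
schedule = [multistep_lr(0.1, [3], 0.1, e) for e in range(5)]
# epochs 0-2 run at 0.1; epochs 3-4 run at 0.01
```

The per-method initial learning rates quoted above (1e-2 for prompt-based CL methods, 1e-3 for ZiRa, 1e-4 for FT/joint/sequential/WiSE-FT/ER) would simply replace `base_lr` in this sketch.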