WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation

Authors: Zesen Cheng, Peng Jin, Hao Li, Kehan Li, Siheng Li, Xiangyang Ji, Chang Liu, Jie Chen

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | With our WiCo, several prominent top-down and bottom-up combinations achieve remarkable improvements on three common datasets with reasonable extra costs, which justifies effectiveness and generality of our method. Section 4 (Experiments): Our model is evaluated on three standard referring image segmentation datasets: RefCOCO [Yu et al., 2016], RefCOCO+ [Yu et al., 2016] and RefCOCOg [Mao et al., 2016].
Researcher Affiliation | Academia | 1 School of Electronic and Computer Engineering, Peking University, Shenzhen, China; 2 AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China; 3 Peng Cheng Laboratory, Shenzhen, China; 4 Tsinghua University, Beijing, China
Pseudocode | No | None found.
Open Source Code | No | None found.
Open Datasets | Yes | Our model is evaluated on three standard referring image segmentation datasets: RefCOCO [Yu et al., 2016], RefCOCO+ [Yu et al., 2016] and RefCOCOg [Mao et al., 2016].
Dataset Splits | No | Our model is evaluated on three standard referring image segmentation datasets: RefCOCO [Yu et al., 2016], RefCOCO+ [Yu et al., 2016] and RefCOCOg [Mao et al., 2016]. The data preprocessing operations are in line with the original implementation of those selected methods.
Hardware Specification | Yes | We train our models for 5,000 iterations on an NVIDIA V100 with a batch size of 24.
Software Dependencies | No | AdamW [Loshchilov and Hutter, 2017] is adopted as our optimizer, and the learning rate and weight decay are set to 1e-5 and 5e-2.
Experiment Setup | Yes | AdamW [Loshchilov and Hutter, 2017] is adopted as our optimizer, and the learning rate and weight decay are set to 1e-5 and 5e-2. We train our models for 5,000 iterations on an NVIDIA V100 with a batch size of 24. To binarize the probability map and get segmentation results, the threshold τ is set to 0.35 to calibrate previous works [Ding et al., 2021].
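
The reported setup can be collected into a small configuration object, with the thresholding step written out. This is a minimal sketch, not the authors' code (none was found): the names `TrainConfig` and `binarize` are hypothetical, and the strict `>` comparison is an assumption, since the quoted text does not say how a pixel exactly at τ is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are illustrative)."""
    optimizer: str = "AdamW"
    learning_rate: float = 1e-5
    weight_decay: float = 5e-2
    iterations: int = 5000
    batch_size: int = 24
    threshold: float = 0.35  # tau, used to binarize the probability map


def binarize(prob_map, threshold):
    """Turn a 2-D probability map (a list of rows) into a 0/1 mask.

    Assumption: a pixel is foreground only when its probability is
    strictly greater than the threshold.
    """
    return [[1 if p > threshold else 0 for p in row] for row in prob_map]


cfg = TrainConfig()
probs = [[0.10, 0.40], [0.35, 0.90]]  # toy probabilities, not real model output
mask = binarize(probs, cfg.threshold)
print(mask)
```

With a threshold below 0.5, borderline pixels such as the 0.40 entry above are kept as foreground, which is the practical effect of choosing τ = 0.35.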