FIRM: Flexible Interactive Reflection ReMoval

Authors: Xiao Chen, Xudong Jiang, Yunkang Tao, Zhen Lei, Qing Li, Chenyang Lei, Zhaoxiang Zhang

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive results on public real-world reflection removal datasets validate that our method demonstrates state-of-the-art reflection removal performance. Empirical results confirm that FIRM effectively improves reflection removal performance while requiring significantly less human guidance.
Researcher Affiliation Academia (1) The Hong Kong Polytechnic University; (2) ETH Zurich; (3) Center for Artificial Intelligence and Robotics, HKISI-CAS; (4) Institute of Automation, Chinese Academy of Sciences; (5) University of Chinese Academy of Sciences
Pseudocode Yes Algorithm 1: Training data synthesis pipeline
  Input: Two clear RGB images T, R and its instance mask M
  Output: Blended image I, reflection instance mask M_r, contrastive points {p_pos, p_neg}
  I ← Reflection Synthesis(T, R)
  R ← threshold(I − T, 0)  # Residual map
  max_reflection_value ← 0
  for each instance M_i in M do
    avg_value ← MEAN(R ⊙ M_i)
    if avg_value > max_reflection_value then
      max_reflection_value ← avg_value
      M_r ← M_i ⊙ R
    end if
  end for
  Randomly sample a reflection point p_pos from M_r
  Randomly select a transmission point p_neg from the neighbour of M_r
  return I, M_r, {p_pos, p_neg}
Open Source Code No The paper does not provide an explicit statement about open-sourcing the code or a link to a code repository. The citation to 'arXiv preprint arXiv:2406.01555' is for a different paper by some of the authors, not a code release for this work.
Open Datasets Yes Following the setting in (Hu and Guo 2023), the training data for reflection removal consists of 7,643 synthesized pairs from the PASCAL VOC dataset (Everingham et al. 2010) and 90 real pairs from (Zhang et al. 2018). The test data includes Real20 and SIR2 (Zhang et al. 2018; Wan et al. 2017). The SIR2 dataset (Wan et al. 2017) consists of three data splits: SIR2-Object, SIR2-Postcard, and SIR2-Wild... Since there is no public reflection segmentation dataset, we manually synthesize training data based on the COCO dataset (Lin et al. 2014).
Dataset Splits Yes Following the setting in (Hu and Guo 2023), the training data for reflection removal consists of 7,643 synthesized pairs from the PASCAL VOC dataset (Everingham et al. 2010) and 90 real pairs from (Zhang et al. 2018). The test data includes Real20 and SIR2 (Zhang et al. 2018; Wan et al. 2017). The SIR2 dataset (Wan et al. 2017) consists of three data splits: SIR2-Object, SIR2-Postcard, and SIR2-Wild... We evaluate the segmentation performance of the trained SARM on synthesized reflections using the COCO validation set (Lin et al. 2014) and real-world reflections from SIR2 (Wan et al. 2017).
Hardware Specification Yes Using point-based prompts, SARM is trained with a fixed learning rate of 0.0005 for 50 epochs on 8 NVIDIA A100 GPUs. The batch size is set as 8. The reflection removal network is optimized using the Adam optimizer for a total of 200,000 iterations, with a batch size of 8 on a single A100 GPU.
Software Dependencies No The proposed framework is implemented with PyTorch.
Experiment Setup Yes During the training phase of SARM, only the proposed modules are optimized. Using point-based prompts, SARM is trained with a fixed learning rate of 0.0005 for 50 epochs on 8 NVIDIA A100 GPUs. The batch size is set as 8. The reflection removal network is optimized using the Adam optimizer for a total of 200,000 iterations, with a batch size of 8 on a single A100 GPU. The initial learning rate is set to 10^-3 and gradually reduced to 10^-6 with the cosine annealing schedule (Loshchilov and Hutter 2016).
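
The learning-rate schedule quoted in the Experiment Setup row can be illustrated with the closed-form cosine annealing rule from Loshchilov and Hutter (2016). The sketch below is not the authors' code: it assumes a single decay cycle with no warm restarts and no warmup, updated once per iteration over the 200,000 iterations stated above. The function name and parameter names are hypothetical.

```python
import math


def cosine_annealed_lr(step, total_steps=200_000, lr_max=1e-3, lr_min=1e-6):
    """Learning rate at `step` for one cosine decay from lr_max to lr_min.

    Assumption: a single cycle without restarts; the summary does not say
    whether the schedule is stepped per iteration or per epoch.
    """
    if not 0 <= step <= total_steps:
        raise ValueError("step must lie in [0, total_steps]")
    # Cosine factor falls smoothly from 1 at step 0 to 0 at total_steps.
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cosine


# Print the rate at the start, quarter, midpoint, and end of training.
for step in (0, 50_000, 100_000, 200_000):
    print(step, cosine_annealed_lr(step))
```

In PyTorch, the same shape is usually obtained by wrapping the Adam optimizer in `torch.optim.lr_scheduler.CosineAnnealingLR` with `T_max` set to the number of scheduler steps and `eta_min` set to the final rate.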
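
The training data synthesis pipeline listed in the Pseudocode row can be sketched in NumPy as follows. Several parts are assumptions rather than details from the paper: the blend `T + alpha * R` is only a placeholder for the paper's Reflection Synthesis step, the "neighbour" of the reflection mask is taken to be a ring obtained by dilating the mask by `margin` pixels, and the names `synthesize_sample`, `dilate`, `alpha`, and `margin` are hypothetical. The mean is taken over the whole image, following the pseudocode literally.

```python
import numpy as np


def dilate(mask, radius):
    """Binary dilation with a (2*radius+1) square window, NumPy only."""
    h, w = mask.shape
    padded = np.pad(mask, radius)  # pads with False
    out = np.zeros_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def synthesize_sample(T, R, instance_masks, rng, alpha=0.5, margin=3):
    """T, R: float (H, W, 3) images in [0, 1]; instance_masks: bool (H, W) masks for R."""
    # Placeholder blend (assumption); the paper's synthesis model is not given here.
    I = np.clip(T + alpha * R, 0.0, 1.0)
    # Residual map: positive part of I - T, averaged over colour channels.
    residual = np.maximum(I - T, 0.0).mean(axis=-1)
    # Keep the instance whose masked residual has the largest mean.
    best_value, best_mask = 0.0, None
    for m in instance_masks:
        avg_value = float((residual * m).mean())
        if avg_value > best_value:
            best_value, best_mask = avg_value, m
    if best_mask is None:
        raise ValueError("no instance produced a visible reflection")
    Mr = best_mask * residual  # soft reflection instance mask
    support = Mr > 0
    # Positive point: a random pixel inside the reflection mask.
    pos_coords = np.argwhere(support)
    p_pos = tuple(int(v) for v in pos_coords[rng.integers(len(pos_coords))])
    # Negative point: a random pixel in the ring just outside the mask (assumption).
    ring = dilate(support, margin) & ~support
    neg_coords = np.argwhere(ring)
    if len(neg_coords) == 0:
        raise ValueError("reflection mask leaves no neighbouring pixels")
    p_neg = tuple(int(v) for v in neg_coords[rng.integers(len(neg_coords))])
    return I, Mr, p_pos, p_neg


# Toy example: a bright 8x8 instance and a dimmer 4x4 instance in R.
rng = np.random.default_rng(0)
T = rng.random((32, 32, 3)) * 0.5
R = np.zeros((32, 32, 3))
R[8:16, 8:16] = 0.8
R[20:24, 20:24] = 0.2
m1 = np.zeros((32, 32), dtype=bool)
m1[8:16, 8:16] = True
m2 = np.zeros((32, 32), dtype=bool)
m2[20:24, 20:24] = True
I, Mr, p_pos, p_neg = synthesize_sample(T, R, [m1, m2], rng)
print(p_pos, p_neg)
```

In the toy example the brighter instance `m1` has the larger mean residual, so the positive point falls inside it and the negative point falls in the three-pixel ring around it.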