SAUGE: Taming SAM for Uncertainty-Aligned Multi-Granularity Edge Detection

Authors: Xing Liufu, Chaolei Tan, Xiaotong Lin, Yonggang Qi, Jinxuan Li, Jian-Fang Hu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on BSDS500, Multicue and NYUDv2 validate our model's superiority.
Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China; (2) Beijing University of Posts and Telecommunications, Beijing, China; (3) Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, Guangzhou, China; (4) Guangdong Province Key Laboratory of Information Security Technology, China. EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the method using equations and textual descriptions but does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | No | The paper does not explicitly state that source code is available, nor does it provide a link to a code repository.
Open Datasets | Yes | We conduct experiments on three widely-used edge detection datasets: BSDS500 (Arbelaez et al. 2010), Multicue (Mély et al. 2016) and NYUDv2 (Silberman et al. 2012). For data augmentation, we adopt the same strategy as UAED (Zhou et al. 2023) across both datasets. BSDS500 consists of 500 natural images, with 200 for training, 100 for validation, and the remaining for test. Each image has 4 to 9 manual annotations. Additionally, the PASCAL VOC set (Everingham et al. 2010) with 10,103 images is used as supplementary training data, with edge annotations derived from semantic masks using Laplacian detector.
Dataset Splits | Yes | BSDS500 consists of 500 natural images, with 200 for training, 100 for validation, and the remaining for test. ... Multicue includes 100 images from complex natural scenes... We randomly split these images into training and evaluation sets, with 80 images for training and 20 for testing. ... NYUDv2 is a dataset for indoor scene parsing and edge detection, containing 1,449 paired RGB-D images. Each image has a single ground-truth edge map, with the dataset split into 381 training, 414 validation, and 654 testing images.
Hardware Specification | Yes | All experiments were conducted on RTX 3090, where training the model on BSDS500 requires approximately 20 GPU hours and 16GB GPU memory.
Software Dependencies | No | We implement SAUGE based on PyTorch (Paszke et al. 2019), and use SAM pre-trained on the SA-1B dataset (Kirillov et al. 2023) as our backbone. The Adam optimizer (Kinga, Adam et al. 2015) is used to update all parameters.
Experiment Setup | Yes | The learning rate is initialized as 1e-4 with step scheduling and weight decay is set to 5e-4. For BSDS, we set the ζ for the thresholding label to 0.2, and the model was trained for 6 epochs with a batch size of 3. For Multicue, we randomly crop the images into 512 × 512 and set ζ to 0.3. The model is trained for 20 epochs using a batch size of 3. ... In this work, we fix λ = 0.1, β = 0.5.
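
The split sizes quoted in the Dataset Splits row can be checked with simple arithmetic, and the 80/20 Multicue split can be sketched as a seeded shuffle. This is a minimal illustration, not the authors' procedure: the helper names and the seed are hypothetical, and the quoted text does not say which images ended up in which Multicue subset.

```python
import random

def remaining(total, *parts):
    """Size of the leftover split once the named splits are taken out."""
    return total - sum(parts)

def random_split(n_items, n_train, seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/test index lists.

    The seed is an assumption made for repeatability; the source does not report one.
    """
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])

# BSDS500: 500 images, 200 train, 100 validation, "the remaining for test".
bsds_test = remaining(500, 200, 100)

# NYUDv2: the three reported splits should add up to the 1,449 image pairs.
nyud_total = 381 + 414 + 654

# Multicue: 100 images split at random into 80 train and 20 test.
train_ids, test_ids = random_split(100, 80)
```

Under these numbers the BSDS500 test set holds 200 images, and the NYUDv2 splits sum to the stated 1,449.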
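
The Open Datasets row mentions that PASCAL VOC edge annotations are derived from semantic masks with a Laplacian detector. The sketch below shows one plausible reading of that step on a binary mask (one object class against background), using a 4-neighbour Laplacian with replicated borders. The function name, the border handling, and the restriction to binary masks are assumptions for illustration; the source does not describe the exact kernel or post-processing.

```python
def laplacian_edges(mask):
    """Binary edge map from a binary mask (list of rows of 0/1).

    A pixel is marked as an edge when the 4-neighbour Laplacian response
    (sum of neighbours minus 4 * centre) is non-zero. Borders are handled
    by replicating the nearest pixel, which is an assumption.
    """
    h, w = len(mask), len(mask[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny = min(max(y + dy, 0), h - 1)  # clamp row index to the image
                nx = min(max(x + dx, 0), w - 1)  # clamp column index to the image
                total += mask[ny][nx]
            response = total - 4 * mask[y][x]
            edges[y][x] = 1 if response != 0 else 0
    return edges

# A 3x3 object centred in a 5x5 image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
edges = laplacian_edges(mask)
```

With this kernel the response is non-zero on both sides of a boundary, so the resulting edges are two pixels thick; a real pipeline would likely thin them.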
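
The Experiment Setup row refers to a threshold ζ for the "thresholding label" (0.2 on BSDS, 0.3 on Multicue) without spelling out the rule. The sketch below follows a convention common in multi-annotator edge detection: average the annotators' binary maps, treat pixels no annotator marked as negatives, pixels whose mean exceeds ζ as positives, and leave the rest out of the loss. Whether SAUGE uses exactly this rule is an assumption, and `IGNORE` and `threshold_label` are hypothetical names.

```python
IGNORE = -1  # hypothetical marker for pixels excluded from the loss

def threshold_label(annotations, zeta):
    """Fuse several binary annotation maps into one label map.

    annotations: list of maps, each a list of rows of 0/1, one map per annotator.
    Returns 0 where no annotator marked an edge, 1 where the mean annotation
    exceeds zeta, and IGNORE for the ambiguous pixels in between.
    """
    n = len(annotations)
    h, w = len(annotations[0]), len(annotations[0][0])
    label = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            mean = sum(a[y][x] for a in annotations) / n
            if mean == 0:
                label[y][x] = 0
            elif mean > zeta:
                label[y][x] = 1
            else:
                label[y][x] = IGNORE
    return label

# Six annotators, a 1x3 image: nobody marks pixel 0, one marks pixel 1,
# three mark pixel 2.
annotations = [
    [[0, 1, 1]],
    [[0, 0, 1]],
    [[0, 0, 1]],
    [[0, 0, 0]],
    [[0, 0, 0]],
    [[0, 0, 0]],
]
bsds_label = threshold_label(annotations, zeta=0.2)
```

Here pixel 1 has a mean of about 0.17, which falls below ζ = 0.2, so it is ignored, while pixel 2 (mean 0.5) becomes a positive.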
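
For anyone attempting a reproduction, the hyperparameters reported in the Experiment Setup row can be gathered into a small configuration object. The sketch below records only values stated in that row; `TrainConfig` and `step_lr` are hypothetical names, and the step size and decay factor of the "step scheduling" are not reported, so they are left as explicit arguments rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class TrainConfig:
    dataset: str
    epochs: int
    zeta: float                      # threshold for the thresholding label
    batch_size: int = 3
    base_lr: float = 1e-4            # initial learning rate
    weight_decay: float = 5e-4
    lam: float = 0.1                 # λ as fixed in the paper
    beta: float = 0.5                # β as fixed in the paper
    crop_size: Optional[int] = None  # side of the square random crop, if any

BSDS = TrainConfig(dataset="BSDS500", epochs=6, zeta=0.2)
MULTICUE = TrainConfig(dataset="Multicue", epochs=20, zeta=0.3, crop_size=512)

def step_lr(base_lr, epoch, step_size, gamma):
    """Step schedule: multiply the rate by gamma every step_size epochs.

    step_size and gamma are NOT given in the source; callers must choose them.
    """
    return base_lr * gamma ** (epoch // step_size)

# Illustrative values only: decay by 10x every 2 epochs.
lr_at_epoch_0 = step_lr(BSDS.base_lr, epoch=0, step_size=2, gamma=0.1)
lr_at_epoch_2 = step_lr(BSDS.base_lr, epoch=2, step_size=2, gamma=0.1)
```

Keeping the unreported schedule parameters out of the defaults makes it obvious which settings come from the paper and which must be tuned or obtained from the authors.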