SAUGE: Taming SAM for Uncertainty-Aligned Multi-Granularity Edge Detection
Authors: Xing Liufu, Chaolei Tan, Xiaotong Lin, Yonggang Qi, Jinxuan Li, Jian-Fang Hu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on BSDS500, Multicue and NYUDv2 validate our model's superiority. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China; (2) Beijing University of Posts and Telecommunications, Beijing, China; (3) Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, Guangzhou, China; (4) Guangdong Province Key Laboratory of Information Security Technology, China |
| Pseudocode | No | The paper describes the method using equations and textual descriptions but does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not explicitly state that source code is available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We conduct experiments on three widely-used edge detection datasets: BSDS500 (Arbelaez et al. 2010), Multicue (Mély et al. 2016) and NYUDv2 (Silberman et al. 2012). For data augmentation, we adopt the same strategy as UAED (Zhou et al. 2023) across both datasets. BSDS500 consists of 500 natural images, with 200 for training, 100 for validation, and the remaining for test. Each image has 4 to 9 manual annotations. Additionally, the PASCAL VOC set (Everingham et al. 2010) with 10,103 images is used as supplementary training data, with edge annotations derived from semantic masks using a Laplacian detector. |
| Dataset Splits | Yes | BSDS500 consists of 500 natural images, with 200 for training, 100 for validation, and the remaining for test. ... Multicue includes 100 images from complex natural scenes... We randomly split these images into training and evaluation sets, with 80 images for training and 20 for testing. ... NYUDv2 is a dataset for indoor scene parsing and edge detection, containing 1,449 paired RGB-D images. Each image has a single ground-truth edge map, with the dataset split into 381 training, 414 validation, and 654 testing images. |
| Hardware Specification | Yes | All experiments were conducted on RTX 3090, where training the model on BSDS500 requires approximately 20 GPU hours and 16GB GPU memory. |
| Software Dependencies | No | We implement SAUGE based on PyTorch (Paszke et al. 2019), and use SAM pre-trained on the SA-1B dataset (Kirillov et al. 2023) as our backbone. The Adam optimizer (Kingma and Ba 2015) is used to update all parameters. |
| Experiment Setup | Yes | The learning rate is initialized as 1e-4 with step scheduling and weight decay is set to 5e-4. For BSDS, we set the ζ for the thresholding label to 0.2, and the model was trained for 6 epochs with a batch size of 3. For Multicue, we randomly crop the images into 512 × 512 and set ζ to 0.3. The model is trained for 20 epochs using a batch size of 3. ... In this work, we fix λ = 0.1, β = 0.5. |
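
The reported experiment setup can be collected into a small configuration object, which may help when attempting a reproduction. The sketch below is not the authors' code (none was released). The class name `TrainConfig` and the helper `step_lr` are hypothetical, and the schedule's `step_size` and `gamma` are placeholder assumptions because the paper excerpt only says "step scheduling" without giving those values. The learning rate, weight decay, ζ, epochs, batch size, crop size, λ and β are taken from the table.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (hypothetical container)."""
    dataset: str
    zeta: float                       # ζ, threshold for the thresholding label
    epochs: int
    batch_size: int = 3
    lr: float = 1e-4                  # initial learning rate
    weight_decay: float = 5e-4
    lam: float = 0.1                  # λ, fixed in the paper
    beta: float = 0.5                 # β, fixed in the paper
    crop_size: Optional[Tuple[int, int]] = None  # random crop, if any


BSDS = TrainConfig(dataset="BSDS500", zeta=0.2, epochs=6)
MULTICUE = TrainConfig(dataset="Multicue", zeta=0.3, epochs=20, crop_size=(512, 512))


def step_lr(cfg: TrainConfig, epoch: int, step_size: int = 2, gamma: float = 0.1) -> float:
    """Learning rate under a step schedule.

    step_size and gamma are ASSUMED values for illustration only;
    the source does not report them.
    """
    return cfg.lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    for epoch in range(BSDS.epochs):
        print(epoch, step_lr(BSDS, epoch))
```

In a PyTorch reproduction these values would typically be passed to `torch.optim.Adam` and `torch.optim.lr_scheduler.StepLR`, again with the decay interval and factor chosen by the reader until the authors publish theirs.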