DenseSAM: Semantic Enhance SAM for Efficient Dense Object Segmentation

Authors: Linyun Zhou, Jiacong Hu, Shengxuming Zhang, Xiangtong Du, Mingli Song, Xiuming Zhang, Zunlei Feng

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on pathology images demonstrate that DenseSAM delivers remarkable performance with minimal training parameters, providing a cost-effective and efficient solution. Moreover, experiments on remote sensing images further validate its excellent scalability, making DenseSAM suitable for various dense object segmentation domains. Extensive experiments show DenseSAM's state-of-the-art performance with minimal parameters and strong generalization to new tasks.
Researcher Affiliation | Academia | Linyun Zhou1, Jiacong Hu1, Shengxuming Zhang1,2, Xiangtong Du3, Mingli Song1, Xiuming Zhang4 and Zunlei Feng1,2,5. 1State Key Laboratory of Blockchain and Data Security, Zhejiang University; 2School of Software Technology, Zhejiang University; 3Xuzhou Medical University; 4The First Affiliated Hospital, College of Medicine, Zhejiang University; 5Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security. Corresponding Author. Email: EMAIL
Pseudocode | No | The paper contains mathematical equations and describes procedures in prose, but there are no clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/imAzhou/DenseSAM.
Open Datasets | Yes | To evaluate the performance of the proposed DenseSAM in dense segmentation of pathology images, we used three commonly used pathology datasets. Additionally, to demonstrate the applicability of DenseSAM in other dense segmentation domains, we also used three common remote sensing datasets, showcasing DenseSAM's robust dense segmentation capability. The six datasets are as follows: Pathology datasets: CPM17 [Vu et al., 2019]... CoNIC [Graham et al., 2021b]... MoNuSeg [Kumar et al., 2017]... Remote sensing datasets: WHU Building [Ji et al., 2018]... Inria Building [Maggiori et al., 2017]... Massachusetts Building [Mnih, 2013].
Dataset Splits | Yes | CPM17 [Vu et al., 2019] includes 32 / 32 pathology images for train / test, with sizes of 500×500 or 600×600 pixels. We resize each image to 1024×1024 pixels, and then crop it into 512×512 pixel patches with no overlap. CoNIC [Graham et al., 2021b] consists of 4981 image patches, each sized 256×256. Following BoNuS [Lin et al., 2024], we randomly split all images into a 7:1:2 ratio, resulting in 3486 / 997 / 498 for train / validation / test. MoNuSeg [Kumar et al., 2017] comprises 44 images, each of size 1000×1000 pixels. The dataset has 30 / 14 images for train / test. We randomly split the train subset into 24 / 6 images for train / validation. Each image is resized to 1024×1024 and then cropped to 256×256 without overlap. Remote sensing datasets: WHU Building [Ji et al., 2018] has a ground resolution of 0.3 meters and an image size of 512×512 pixels. It contains 4736 / 1036 / 2416 images for train / validation / test. Inria Building [Maggiori et al., 2017] contains 360 images collected from five cities at a 30 cm resolution. We process it consistently with UANet [Li et al., 2024] and crop the images into 512×512 pixels, resulting in 9737 / 1942 images for train / validation. Massachusetts Building [Mnih, 2013] contains 151 aerial images with a spatial resolution of 1 meter and an image size of 1500×1500 pixels. We crop the images into 500×500 pixels, yielding 1233 / 36 / 90 images for train / validation / test.
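The preprocessing quoted above (non-overlapping patch cropping and a random ratio split) can be sketched as below. This is an illustrative NumPy version, not code from the DenseSAM repository; the function names `crop_patches` and `split_indices`, the weights, and the fixed seed are our own assumptions.

```python
import numpy as np

def crop_patches(image, patch=512):
    """Split an image array (H, W, C) into non-overlapping patch x patch tiles."""
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            tiles.append(image[y:y + patch, x:x + patch])
    return tiles

def split_indices(n, ratios=(0.7, 0.1, 0.2), seed=0):
    """Randomly split n item indices into train/val/test by the given ratios."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# A resized 1024x1024 image yields four 512x512 patches with no overlap.
patches = crop_patches(np.zeros((1024, 1024, 3), dtype=np.uint8))
train_idx, val_idx, test_idx = split_indices(4981)  # CoNIC-sized example
```

Note that the exact train/validation/test counts depend on rounding and on the random seed, so reproduced splits may differ slightly from the paper's reported numbers unless the authors' split files are used.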
Hardware Specification | No | The paper does not specify the hardware used for its experiments (exact GPU/CPU models, processor speeds, or memory amounts). It only mentions using the ViT-H variant for the image encoder, which is a model architecture choice, not hardware.
Software Dependencies | No | The paper mentions the Adam optimizer and specific loss functions (Dice loss, BCE loss, contrastive loss) but does not list software dependencies or library versions (e.g., Python, PyTorch, or CUDA versions).
Experiment Setup | Yes | For the loss function, we use a linear combination of Dice loss, BCE loss, and contrastive loss. We adopt the Adam optimizer, and training ranges from 10 to 30 epochs.
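The linear loss combination described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the paper does not report the combination weights (equal weights are assumed here), and the contrastive term is treated as a precomputed scalar since its exact form is not quoted in this report.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss over probability maps in [0, 1]."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy over probability maps."""
    p = np.clip(pred, eps, 1.0 - eps)
    return float(-(target * np.log(p) + (1 - target) * np.log(1 - p)).mean())

def total_loss(pred, target, contrastive, weights=(1.0, 1.0, 1.0)):
    """Linear combination of Dice, BCE, and a contrastive term.

    The weights are illustrative assumptions, not values from the paper.
    """
    w_dice, w_bce, w_con = weights
    return (w_dice * dice_loss(pred, target)
            + w_bce * bce_loss(pred, target)
            + w_con * contrastive)
```

In practice this would be computed on framework tensors with autograd (the paper uses Adam, suggesting a gradient-based framework), but the arithmetic of the combination is the same.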