Weakly Supervised Object Segmentation by Background Conditional Divergence

Authors: Hassan Baker, Matthew Emigh, Austin J. Brockmeier

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct experiments on side-scan and synthetic aperture sonar in which our approach succeeds compared to previous unsupervised segmentation baselines that were only tested on natural images. Furthermore, to show generality we extend our experiments to natural images, obtaining reasonable performance with our method... We evaluate the effectiveness of our segmentation approach on the following datasets... We first present results for the real datasets: AI4Shipwrecks in Section 6.1, SAS-Clutter in Section 6.2, and CUB-200-2011 in Section 6.3. Then we present the results for the synthetic objects on real sonar images, SAS+dSprites, in Section 6.4, and results on synthetic composite images, Textures+MNIST/Fashion MNIST/dSprites, in Section 6.5."
Researcher Affiliation | Academia | Hassan Baker, EMAIL, Department of Electrical and Computer Engineering, University of Delaware, Newark, DE 19716, USA; Matthew S. Emigh, EMAIL, Naval Surface Warfare Center Panama City Division, Panama City, FL 32407, USA; Austin J. Brockmeier, EMAIL, Department of Electrical and Computer Engineering, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA.
Pseudocode | Yes | Algorithm 1 Training Procedure. Require: Model Mθ, distributions PX|Cb=c, PX|Cb =c, and PR|Cb=c for c ∈ {1, . . . , K}, batch size B
Open Source Code | Yes | "The code for this work can be found at https://github.com/bakerhassan/WSOS."
Open Datasets | Yes | "1. Toy datasets created by combining canonical object datasets (e.g., MNIST, Fashion MNIST (Xiao et al., 2017), and dSprites (Matthey et al., 2017)) with textures as backgrounds (Brodatz, 1966). ... 4. SAS+dSprites consists of real SAS images of various seafloor types (Cobb & Zare, 2014) with artificial shapes from dSprites (Matthey et al., 2017) as synthetic objects. 5. CUB-200-2011 consists of natural images of 200 species of birds (Wah et al., 2011)."
Dataset Splits | Yes | "The dataset is already divided 50/50 into training and test respecting the variety of sites. We further split the training data into 70% training and 30% validation. ... The data is divided into 70%, 20%, and 10% for training, validation, and testing, respectively. ... For CUB-200-2011, ... there are 5994 train images and 5794 test images. We further split the training data into 70% training and 30% validation."
Hardware Specification | Yes | "Only one GPU, a Tesla V100-SXM2-32GB, is used for training."
Software Dependencies | No | The paper mentions the "PyTorch Lightning (Falcon & The PyTorch Lightning team, 2019) framework" but gives no version numbers for PyTorch Lightning or any other software dependency.
Experiment Setup | Yes | "The AE architecture... We use the Adam optimizer to optimize the AE, with a fixed learning rate of 0.0001, and batch size of 4096. ... For the AI4Shipwrecks and SAS-Clutter datasets, a nearby patch of seafloor without any object can be passed to the AE/CLE + Sinkhorn K-means to infer the background of a foreground object image. The entropic regularization parameter in the Sinkhorn algorithm was set to 0.01. ... We choose our segmentation model to be the U-Net architecture (Ronneberger et al., 2015b) with default hyperparameters. We choose the D-Adaptation optimizer to optimize U-Net (Defazio & Mishchenko, 2023). The learning rate for the U-Net is 1 and the batch size is 400. ... For the AI4Shipwrecks and CUB datasets, we run the training for 150 and 200 epochs, respectively. ... For SAS+dSprites and Textures+MNIST/Fashion MNIST/dSprites, we run the training for 50 and 100 epochs, respectively."
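The quoted setup fixes the Sinkhorn entropic regularization at 0.01. As a rough illustration of what that parameter does (this is a generic, pure-Python sketch of Sinkhorn normalization with uniform marginals, not the authors' AE/CLE + Sinkhorn K-means implementation; the toy cost matrix and iteration count are made up):

```python
import math

def sinkhorn(cost, eps=0.01, iters=200):
    """Scale K = exp(-cost/eps) so rows sum to 1/n and columns to 1/m
    (uniform marginals), via alternating Sinkhorn updates."""
    n, m = len(cost), len(cost[0])
    K = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        for i in range(n):
            u[i] = (1.0 / n) / sum(K[i][j] * v[j] for j in range(m))
        for j in range(m):
            v[j] = (1.0 / m) / sum(K[i][j] * u[i] for i in range(n))
    # Transport plan: rows ~ points, columns ~ cluster assignments.
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy 2x2 assignment cost; a small eps (0.01, as in the paper)
# concentrates each row's mass on its cheapest match.
P = sinkhorn([[0.1, 0.9], [0.8, 0.2]], eps=0.01)
```

With eps = 0.01 the resulting plan is nearly a hard, balanced assignment; larger eps values smooth it toward uniform, which is the trade-off the regularization parameter controls.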