Modeling Object Dissimilarity for Deep Saliency Prediction

Authors: Bahar Aydemir, Deblina Bhattacharjee, Tong Zhang, Seungryong Kim, Mathieu Salzmann, Sabine Süsstrunk

TMLR 2022

Reproducibility Variable Result LLM Response
Research Type Experimental As evidenced by our experiments, this consistently boosts the accuracy of the baseline networks, enabling us to outperform the state-of-the-art models on three saliency benchmarks, namely SALICON, MIT300 and CAT2000. Our experiments on the SALICON (Jiang et al., 2015), MIT1003 (Judd et al., 2009) and CAT2000 (Borji & Itti, 2015) benchmarks demonstrate that our approach consistently improves the results of the baseline saliency networks we build on.
Researcher Affiliation Academia 1School of Computer and Communication Sciences, EPFL, Switzerland, 2Department of Computer Science and Engineering, Korea University, South Korea
Pseudocode No The paper describes its methodology through architectural diagrams (Figure 3) and textual descriptions in Section 3 "Methodology", but no explicit pseudocode or algorithm blocks are provided.
Open Source Code Yes Our project page is at https://github.com/IVRL/DisSal. We will make our code publicly available. We implemented our approach using PyTorch and will make our code publicly available.
Open Datasets Yes We report the performance of our methods on three publicly available saliency detection benchmarks. We train our models on 10,000 images of the SALICON (Jiang et al., 2015) dataset, which consists of diverse context-rich images from the MS COCO dataset (Lin et al., 2014). We also fine-tune our SALICON-trained models on the MIT1003 dataset (Judd et al., 2009). In addition, we fine-tune our model on the CAT2000 (Borji & Itti, 2015) dataset.
Dataset Splits Yes The dataset contains 10,000 training, 5,000 validation, and 5,000 test images, which makes it the largest saliency detection dataset to date. We also fine-tune our SALICON-trained models on the MIT1003 dataset (Judd et al., 2009), which consists of 1003 everyday scenes collected from Flickr and Label Me, and evaluate them on the commonly used validation partition of MIT1003, and on the official MIT300 test set, which contains 300 natural images. In addition, we fine-tune our model on the CAT2000 (Borji & Itti, 2015) dataset, which comprises 2000 training and 2000 test images organized in 20 diverse categories. For CAT2000, we use 125 and 50 images across 20 categories to fine-tune and validate our model, respectively.
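The CAT2000 fine-tuning/validation split described above (125 and 50 images per category across 20 categories) can be sketched as follows; the function name, path layout, and random seeding are our own assumptions for illustration, not the authors' code.

```python
import random
from collections import defaultdict

def split_cat2000(image_paths, n_finetune=125, n_val=50, seed=0):
    """Illustrative per-category split for CAT2000-style data.

    The paper reports 125 fine-tuning and 50 validation images per
    category; everything else here is a hypothetical sketch.
    """
    by_category = defaultdict(list)
    for path in image_paths:
        # Assume paths look like "CAT2000/<category>/<image>.jpg".
        category = path.split("/")[-2]
        by_category[category].append(path)

    rng = random.Random(seed)
    finetune, val = [], []
    for category, paths in sorted(by_category.items()):
        rng.shuffle(paths)
        finetune.extend(paths[:n_finetune])
        val.extend(paths[n_finetune:n_finetune + n_val])
    return finetune, val
```

With 20 categories this yields 2,500 fine-tuning and 1,000 validation images, disjoint by construction.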
Hardware Specification Yes We use two NVIDIA V100 GPUs (7 TFLOPS, 32 GB memory).
Software Dependencies No The paper mentions implementing the approach using PyTorch and using the Adam optimizer, but it does not specify version numbers for these software components or for any other libraries.
Experiment Setup Yes During training, we resize all images to 480x640 for the global saliency prediction branch and 300x300 for the object detection one. We use random orthogonal initialization for the decoder layers. Furthermore, we use the Adam optimizer to train the global saliency branch, with an initial learning rate of 10^-4. We set the batch size to 2. We validate the network after each epoch and select the best model from the validation phase to avoid over-fitting. When fine-tuning on MIT1003, we use a batch size of 2 and an initial learning rate of 10^-5. We also initialize our global saliency branch based on the current state-of-the-art model on the MIT/Tuebingen benchmark, namely UNISAL (Droste et al., 2020), with parameters provided by the authors of (Droste et al., 2020).
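The reported training configuration can be sketched in PyTorch. Only the orthogonal decoder initialization, the Adam optimizer, the 10^-4 learning rate, and the batch size of 2 come from the paper; the toy decoder architecture and tensor shapes below are placeholders of our own.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the saliency decoder; the real
# architecture is described in Section 3 of the paper.
decoder = nn.Sequential(
    nn.Conv2d(256, 128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(128, 1, kernel_size=1),
)

# Random orthogonal initialization for the decoder layers, as reported.
for module in decoder.modules():
    if isinstance(module, nn.Conv2d):
        nn.init.orthogonal_(module.weight)
        nn.init.zeros_(module.bias)

# Adam with the reported initial learning rate of 1e-4
# (1e-5 when fine-tuning on MIT1003), batch size 2.
optimizer = torch.optim.Adam(decoder.parameters(), lr=1e-4)

features = torch.randn(2, 256, 30, 40)  # dummy batch; shapes assumed
saliency = decoder(features)            # one saliency channel per image
```

Per-epoch validation with best-checkpoint selection, as the paper describes, would wrap this in a standard training loop.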