Addressing Multi-Label Learning with Partial Labels: From Sample Selection to Label Selection

Authors: Gengyu Lyu, Bohang Sun, Xiang Deng, Songhe Feng

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results on various multi-label datasets demonstrate that our CLS is significantly superior to other state-of-the-art methods. We employ three multi-label datasets, including MS-COCO (Lin et al. 2014), PASCAL VOC 2007 (Everingham et al. 2010) and NUS-WIDE (Chua et al. 2009).
Researcher Affiliation | Collaboration | Gengyu Lyu (1,2,4), Bohang Sun (1), Xiang Deng (2,3), Songhe Feng* (2,5). Affiliations: 1 College of Computer Science, Beijing University of Technology; 2 School of Computer Science and Technology, Beijing Jiaotong University; 3 Department of Automation, Tsinghua University; 4 Idealism Beijing Technology Co., Ltd.; 5 Key Laboratory of Big Data & Artificial Intelligence in Transportation (Beijing Jiaotong University), Ministry of Education
Pseudocode | Yes | Algorithm 1 describes the training process of our proposed CLS, where we train two deep neural networks in a mini-batch manner.
Open Source Code | No | No explicit statement about releasing code or a link to a repository is provided in the paper.
Open Datasets | Yes | We employ three multi-label datasets, including MS-COCO (Lin et al. 2014), PASCAL VOC 2007 (Everingham et al. 2010) and NUS-WIDE (Chua et al. 2009). Since these datasets are fully annotated, we follow (Chen et al. 2021b) to randomly drop some positive labels to generate ML-PL datasets according to a dropping rate α, where α ∈ {25%, 50%, 75%} indicates the proportion of dropped positive labels.
Dataset Splits | Yes | We employ three multi-label datasets, including MS-COCO (Lin et al. 2014), PASCAL VOC 2007 (Everingham et al. 2010) and NUS-WIDE (Chua et al. 2009). Since these datasets are fully annotated, we follow (Chen et al. 2021b) to randomly drop some positive labels to generate ML-PL datasets according to a dropping rate α, where α ∈ {25%, 50%, 75%} indicates the proportion of dropped positive labels. Besides, we consider the extreme case of ML-PL, where each instance is annotated with only one relevant label.
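The label-dropping protocol quoted above can be sketched as follows. This is a minimal illustration, assuming a binary label matrix with entries in {0, 1}; the function name and the use of NumPy are ours, not taken from the paper:

```python
import numpy as np

def drop_positive_labels(Y, alpha, rng=None):
    """Randomly drop a fraction `alpha` of the positive labels in a
    multi-label matrix Y (n_samples x n_classes, entries in {0, 1}),
    setting them to 0 (unobserved) to simulate an ML-PL dataset."""
    rng = np.random.default_rng(rng)
    Y_partial = Y.copy()
    # Locate all positive entries, then drop alpha of them at random.
    pos_rows, pos_cols = np.nonzero(Y == 1)
    n_drop = int(round(alpha * len(pos_rows)))
    drop_idx = rng.choice(len(pos_rows), size=n_drop, replace=False)
    Y_partial[pos_rows[drop_idx], pos_cols[drop_idx]] = 0
    return Y_partial
```

With α = 50%, exactly half of the positive entries become unobserved while negatives are untouched, matching the dropping-rate semantics described in the row above.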
Hardware Specification | Yes | We train our model in an end-to-end manner and accomplish all experiments on a computer with an Intel(R) Xeon(R) CPU E5-2620, 64 GB main memory, and two TITAN Xp GPUs.
Software Dependencies | No | The paper mentions that the method is implemented based on PyTorch, but does not provide a specific version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | For fair comparison, we adopt ResNet-50 (He et al. 2016) pre-trained on ImageNet (Deng et al. 2009) as the feature-extraction backbone for all methods. The input images are squished and randomly cropped to 224 × 224. Adam is used as the optimizer with a weight decay of 10^-4. The batch size is set to 120 for all datasets. We run 100 epochs in total with an initial learning rate of 4 × 10^-5 and decrease it to 0 using cosine decay. We adopt Binary Cross Entropy loss as our loss function. We set the selection rates R_h(e) and R_l(e) as follows: R_h(e) = 1 − min{(e / E_k) · τ_h, τ_h}, R_l(e) = 1 − min{(e / E_k) · τ_l, τ_l}, where τ_l = 0.02 for the PASCAL VOC 2007 dataset, and τ_l = 0.06 for the NUS-WIDE and MS-COCO datasets. E_k is set to 10 for all datasets. The values of τ_h are {0.002, 0.003, 0.005, 0.005}, {0.005, 0.010, 0.012, 0.012}, and {0.006, 0.012, 0.020, 0.020} on the PASCAL VOC 2007, MS-COCO, and NUS-WIDE datasets under four configurations respectively, which are found through cross-validation.
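Under these settings, the selection-rate schedule and the cosine learning-rate decay can be sketched as below. This is an illustrative reading of the quoted setup, not code from the paper; the function names are ours, and the schedule follows the reconstructed formula R(e) = 1 − min{(e / E_k) · τ, τ}:

```python
import math

def selection_rate(e, E_k=10, tau=0.02):
    # Start by keeping all samples (rate 1.0), then linearly lower the
    # keep-rate to 1 - tau over the first E_k epochs and hold it there.
    return 1.0 - min(e / E_k * tau, tau)

def cosine_lr(epoch, total_epochs=100, lr0=4e-5):
    # Cosine decay from lr0 at epoch 0 down to 0 at the final epoch,
    # matching "initial learning rate of 4e-5, decreased to 0".
    return 0.5 * lr0 * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

For example, with E_k = 10 and τ_l = 0.02 (the PASCAL VOC 2007 setting), the keep-rate ramps from 1.0 at epoch 0 down to 0.98 by epoch 10 and stays there for the remaining epochs.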