EDGE: Unknown-aware Multi-label Learning by Energy Distribution Gap Expansion
Authors: Yuchen Sun, Qianqian Xu, Zitai Wang, Zhiyong Yang, Junwei He
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, comprehensive experimental results on multiple multi-label and OOD datasets reveal the effectiveness of the proposed method. |
| Researcher Affiliation | Academia | 1Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, CAS, Beijing, China 2School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, China EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Energy Distribution Gap Expansion (EDGE) Input: ID training set Dtrain in, OE dataset DOE; Parameter: model parameter θ, epoch T, transform epoch τ, hyper-parameter α, β, k, m; 1: Compute dilation distance between Dtrain in and each individual set DOE by Eq. (9). 2: for t = 0, 1, . . . , T do 3: Sample a batch of ID and OE training data from Dtrain in and DOE respectively. 4: if t < τ then 5: Set β = 0. 6: end if 7: Calculate each loss using Eq. (4), (5), and (6). 8: Update parameter θ by minimizing (8). 9: end for |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating the release of source code for the methodology described. |
| Open Datasets | Yes | We adopt three real-world benchmark multi-label image annotation datasets as our training sets, i.e., PASCAL-VOC (Everingham et al. 2015), MSCOCO (Lin et al. 2014) and NUS-WIDE (Chua et al. 2009). ... we choose the following benchmark OOD datasets as main experimental candidates: (a subset of) ImageNet-22K (Deng et al. 2009), Textures (Cimpoi et al. 2014), Places50 (Zhou et al. 2018), iSUN (Xu et al. 2015), iNaturalist (Horn et al. 2018), LSUN-C (Yu et al. 2015), and SVHN (Netzer et al. 2011). |
| Dataset Splits | Yes | MS-COCO contains 122,218 annotation images with 80 categories, where 82,783 images are for training, 40,504 for validation, and 40,775 for testing; and NUS-WIDE covers 269,648 real-world images with 81 categories, where we reserve 119,986 images of the training set and 80,283 images of the test set, respectively. |
| Hardware Specification | Yes | All the experiments are run on NVIDIA GeForce RTX 3090 implemented by PyTorch and conducted on the ResNet-50 (He et al. 2016) and VGG16 (Simonyan and Zisserman 2015) pre-trained on ImageNet-1K (Deng et al. 2009). |
| Software Dependencies | No | All the experiments are run on NVIDIA GeForce RTX 3090 implemented by PyTorch and conducted on the ResNet-50 (He et al. 2016) and VGG16 (Simonyan and Zisserman 2015) pre-trained on ImageNet-1K (Deng et al. 2009). We use stochastic gradient descent (SGD) (Sutskever et al. 2013) with a momentum of 0.9 and a weight decay of 1e-4. For trivial BCE models, we use the Adam (Kingma and Ba 2015) to optimize. |
| Experiment Setup | Yes | We use stochastic gradient descent (SGD) (Sutskever et al. 2013) with a momentum of 0.9 and a weight decay of 1e-4. For trivial BCE models, we use the Adam (Kingma and Ba 2015) to optimize. The main hyper-parameters α, β, and m take values from {1e+1, 1, 1e-1, 1e-2, 1e-3}, {1, 1e-1, 1e-2, 1e-3, 1e-4}, and {0, 1, 2, 3, 4, 5}. |
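The Algorithm 1 excerpt above can be sketched in code. Note this is a minimal illustrative sketch only: the paper's loss terms (Eqs. 4-6), dilation distance (Eq. 9), and exact objective (Eq. 8) are not reproduced in this report, so the free-energy score and the hinge-style outlier-exposure and gap terms below are assumptions, as are the function names `beta_schedule` and `edge_objective`. Only the β-scheduling logic (β set to 0 before the transform epoch τ) is taken directly from the quoted pseudocode.

```python
import numpy as np

def energy(logits):
    # Free-energy score commonly used in energy-based OOD detection:
    # E(x) = -log sum_c exp(f_c(x)); lower values indicate more ID-like inputs.
    return -np.log(np.exp(logits).sum(axis=1))

def beta_schedule(t, tau, beta):
    # Steps 4-6 of Algorithm 1: the gap-expansion term is disabled (beta = 0)
    # for epochs before the transform epoch tau.
    return 0.0 if t < tau else beta

def edge_objective(loss_id, logits_id, logits_oe, alpha, beta_t, m):
    # Hypothetical composition of the overall objective (Eq. 8 in the paper):
    # ID classification loss plus an outlier-exposure term and a hinge that
    # widens the ID/OOD energy gap by margin m. Both auxiliary terms are
    # illustrative stand-ins for Eqs. (5) and (6).
    e_id, e_oe = energy(logits_id), energy(logits_oe)
    loss_oe = np.maximum(0.0, m - e_oe).mean()                  # push OE energy above m
    loss_gap = np.maximum(0.0, m + e_id.mean() - e_oe.mean())   # expand the energy gap
    return loss_id + alpha * loss_oe + beta_t * loss_gap
```

A training loop would then sample one ID batch and one OE batch per step (step 3 of Algorithm 1) and minimize `edge_objective` with `beta_t = beta_schedule(t, tau, beta)`.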