Identifying and Reusing Learnwares Across Different Label Spaces
Authors: Jian-Dong Liu, Zhi-Hao Tan, Zhi-Hua Zhou
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments with thousands of heterogeneous models validate our approach, demonstrating that reusing identified learnware combinations can outperform both training from scratch and fine-tuning a generic pre-trained model. |
| Researcher Affiliation | Academia | Jian-Dong Liu¹,², Zhi-Hao Tan¹,², and Zhi-Hua Zhou¹,². ¹National Key Laboratory for Novel Software Technology, Nanjing University, China. ²School of Artificial Intelligence, Nanjing University, China. |
| Pseudocode | Yes | Algorithm 1 Specification Generation ... Algorithm 2 Multiple Learnware Identification |
| Open Source Code | No | The paper does not explicitly provide a link to the source code for the methodology described, nor does it contain an unambiguous statement of code release for this specific work. |
| Open Datasets | Yes | For tabular scenarios, we develop 745 learnwares, each with a unique label space, derived from 10 real-world datasets on the OpenML [Vanschoren et al., 2013] platform... For image scenarios, we develop 300 heterogeneous learnwares from 12 real-world datasets: EuroSAT [Helber et al., 2019], SVHN [Netzer et al., 2011], CIFAR10/100 [Krizhevsky and Hinton, 2009], AID [Xia et al., 2017], Pets [Parkhi et al., 2012], Resisc45 [Cheng et al., 2017], DTD [Cimpoi et al., 2014], Food [Bossard et al., 2014], Caltech101 [Li et al., 2004], Flowers [Nilsback and Zisserman, 2008], and CUB2011 [Wah et al., 2011]. |
| Dataset Splits | Yes | For each dataset, we test user tasks covering all classes, with each task Du including the test set and 10 labeled data per class from the training set. ... we test cases where labeled data per class increases from 10 to 5,000, or all available data if fewer than 5,000. |
| Hardware Specification | No | The paper does not explicitly provide details about the specific hardware (e.g., GPU models, CPU types) used to run the experiments. |
| Software Dependencies | No | The paper mentions using 'LightGBM' and the 'SGD optimizer' but does not specify version numbers for these or any other software libraries or programming languages used for implementation. |
| Experiment Setup | Yes | We set the specification size n to 10 and use a Gaussian kernel k(x₁, x₂) = exp(−γ‖x₁ − x₂‖₂²) with γ = 0.1. For learnware search, K is chosen from {2, 3, 4} for different datasets, with constants M = 5 and λ = 100. ... training ResNet50 [He et al., 2016] models for 100 epochs using the SGD optimizer, with the initial learning rate chosen from {0.1, 0.01, 0.001} and cosine annealing learning rate decay. |
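The Gaussian kernel quoted in the experiment setup, k(x₁, x₂) = exp(−γ‖x₁ − x₂‖₂²) with γ = 0.1, can be sketched as follows. This is a generic RBF-kernel snippet written for this report, not code from the paper; the function name and the NumPy-based implementation are our own choices.

```python
import numpy as np

def gaussian_kernel(x1, x2, gamma=0.1):
    """RBF kernel k(x1, x2) = exp(-gamma * ||x1 - x2||_2^2).

    gamma = 0.1 matches the value quoted from the paper's setup;
    everything else here is an illustrative implementation choice.
    """
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

# Identical inputs yield the kernel's maximum value of 1;
# the value decays toward 0 as the squared distance grows.
print(gaussian_kernel([1.0, 2.0], [1.0, 2.0]))  # 1.0
print(gaussian_kernel([0.0, 0.0], [3.0, 4.0]))  # exp(-0.1 * 25)
```

With γ = 0.1 the kernel decays gently, so points several units apart still retain non-negligible similarity, which is relevant when comparing specifications built from heterogeneous feature spaces.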