Assessing Pre-Trained Models for Transfer Learning Through Distribution of Spectral Components

Authors: Tengxue Zhang, Yang Shu, Xinyang Chen, Yifei Long, Chenjuan Guo, Bin Yang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted comprehensive experiments across three benchmarks and two tasks including image classification and object detection, demonstrating that our method achieves state-of-the-art performance in choosing proper pre-trained models from the model hub for transfer learning.
Researcher Affiliation | Academia | 1 School of Data Science and Engineering, East China Normal University; 2 School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
Pseudocode | No | The paper describes its methods through mathematical formulations and conceptual steps but includes no explicitly labeled pseudocode or algorithm blocks. For example, the calculation of classification and regression scores is presented as equations and descriptive text, not as structured algorithms.
Open Source Code | No | The paper neither states that its own source code is released nor provides a direct link to a code repository for the described methodology. It mentions existing tools and models such as YOLOv5/v8 by Ultralytics and Hugging Face, but not its own implementation.
Open Datasets | Yes | We adopt 11 widely-used datasets in classification tasks, including FGVC Aircraft (Maji et al. 2013), Stanford Cars (Krause et al. 2013), Food101 (Bossard, Guillaumin, and Van Gool 2014), Oxford-IIIT Pets (Parkhi et al. 2012), Oxford-102 Flowers (Nilsback and Zisserman 2008), Caltech101 (Fei-Fei, Fergus, and Perona 2004), CIFAR-10 (Krizhevsky, Hinton et al. 2009), CIFAR-100 (Krizhevsky, Hinton et al. 2009), VOC2007 (Everingham et al. 2010), SUN397 (Xiao et al. 2010), and DTD (Cimpoi et al. 2014). For object detection, we select five datasets that span different domains and sizes to evaluate our model selection metric: Blood (Roboflow 2022), Fork Lift (Traore 2022), NFL (home 2022), Valorant Video Game (Magonis 2022), and CSGO Video Game (ASD 2022).
Dataset Splits | No | The paper uses widely adopted datasets and takes their fine-tuning performances from other sources. While standard splits are typically used for these datasets, the main text does not explicitly state the training/validation/test split percentages or sample counts for the paper's own experiments.
Hardware Specification | No | The paper gives no specifics about the hardware used to run the experiments, such as GPU models, CPU types, or memory. It focuses on methodology and results without describing the computational environment.
Software Dependencies | No | The paper mentions obtaining pre-trained models from the PyTorch repository and Hugging Face, but it does not specify version numbers for PyTorch, Python, or any other software libraries needed to reproduce the implementation.
Experiment Setup | No | The paper states a single hyperparameter, "We set G = 10 in our experiments," but omits other setup details such as learning rates, batch sizes, optimizers, and number of epochs needed for reproducibility.