Enhancing Prediction Performance through Influence Measure

Authors: Shuguang Yu, Wenqian Xu, Xinyi Zhou, Xuechun Wang, Hongtu Zhu, Fan Zhou

ICLR 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | The effectiveness of these methods is demonstrated through extensive simulations and real-world datasets. In this section, we present experimental results that illustrate how our newly proposed metrics, FIutil and FIactive, contribute to enhancing model prediction performance. We compare our algorithms with state-of-the-art strategies in both data trimming and active learning scenarios, thereby validating the effectiveness of our approach on both simulated and real-world datasets.
Researcher Affiliation | Academia | Shuguang Yu (a), Wenqian Xu (a), Xinyi Zhou (a), Xuechun Wang (a), Hongtu Zhu (b), and Fan Zhou (a); (a) Shanghai University of Finance and Economics, Shanghai, China; (b) University of North Carolina at Chapel Hill, North Carolina, USA
Pseudocode | Yes | Algorithm 1: Calculation of FIutil; Algorithm 2: Calculation of FIactive; Algorithm 3: Data Trimming; Algorithm 4: Active Learning; Algorithm 5: Calculation of FIutil approx; Algorithm 6: Data Trimming with Approximation; Algorithm 7: Calculation of FIactive approx; Algorithm 8: Active Learning with Approximation
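The row above names the paper's algorithms but not their internals. As a hedged illustration of the general idea behind influence-guided data trimming (in the spirit of Algorithms 1 and 3, though the paper's FIutil may be defined differently), the sketch below scores each training point with the classical influence-function approximation of Koh & Liang (2017) for a ridge-regression model and drops the points whose removal is predicted to lower validation loss:

```python
import numpy as np

def ridge_fit(X, y, lam):
    # Closed-form ridge solution: argmin_w 1/2||Xw - y||^2 + (lam/2)||w||^2
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def removal_influence(X_tr, y_tr, X_val, y_val, lam=1e-2):
    # Approximate change in validation loss caused by removing training
    # point i: (1/n) * g_val^T H^{-1} g_i, where g_i = x_i * residual_i
    # and H is the Hessian of the mean training loss.
    n, d = X_tr.shape
    w = ridge_fit(X_tr, y_tr, lam)
    H = X_tr.T @ X_tr / n + (lam / n) * np.eye(d)
    g_val = X_val.T @ (X_val @ w - y_val) / len(y_val)
    v = np.linalg.solve(H, g_val)          # H^{-1} g_val
    residuals = X_tr @ w - y_tr
    return (X_tr @ v) * residuals / n      # one score per training point

def trim(X_tr, y_tr, X_val, y_val, k, lam=1e-2):
    # Drop the k points whose removal most decreases validation loss,
    # i.e. the k most negative influence scores.
    scores = removal_influence(X_tr, y_tr, X_val, y_val, lam)
    return np.sort(np.argsort(scores)[k:])
```

On synthetic data with a handful of label-corrupted points, the most negative scores concentrate on the corrupted points, so trimming them tends to improve validation loss — which is the behavior the paper's data-trimming experiments evaluate at scale.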
Open Source Code | No | The paper discusses methods and experiments but provides neither an explicit statement about releasing its source code nor a link to a repository for the methodology it describes.
Open Datasets | Yes | Adult (Access Link: Adult Database); Bank (Access Link: Bank Database); CelebA (Access Link: CelebA Database); Jigsaw Toxicity (Access Link: Jigsaw Toxicity Database); MNIST (Access Link: MNIST Database); EMNIST (Access Link: EMNIST Database); CIFAR-10 (Access Link: CIFAR-10 Database); Office-31 (Access Link: Office-31 Dataset); AG News (Access Link: AG News Dataset)
Dataset Splits | Yes | Validation on 2D linear model: each dataset comprises 150 training, 100 validation, and 600 test samples. Validation on 2D nonlinear model: each dataset comprises 500 training, 250 validation, and 250 test samples. Adult: 37,692 samples, with 30,162 for training and 7,530 for testing. ... we randomly sample 5,000 instances from the original training set to serve as the new training set and, similarly, 4,200 instances from the original test set to serve as the new test set.
Hardware Specification | No | The paper discusses computational complexity and performance but does not report specific hardware details, such as the GPU/CPU models or memory used to run its experiments.
Software Dependencies | No | "All differentiation operations can be easily computed using backpropagation (Goodfellow et al., 2016) in deep learning libraries such as TensorFlow (Abadi et al., 2016) and PyTorch (Paszke et al., 2017)." However, specific version numbers for these libraries are not provided.
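The quoted sentence is the paper's only pointer to its software stack. As a small, hedged illustration (the model and data here are placeholders, not the paper's), the parameter gradients that such influence measures consume can be obtained by backpropagation in PyTorch:

```python
import torch

# Placeholder model and batch; any differentiable model works the same way.
model = torch.nn.Linear(3, 1)
x = torch.randn(8, 3)
y = torch.randn(8, 1)

# Backpropagation: gradient of the (validation) loss w.r.t. all parameters.
loss = torch.nn.functional.mse_loss(model(x), y)
grads = torch.autograd.grad(loss, model.parameters())
# grads mirrors model.parameters(): weight gradient (1, 3), bias gradient (1,)
```

`torch.autograd.grad` returns the gradients without mutating `.grad` buffers, which is convenient when gradients feed an influence computation rather than an optimizer step.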
Experiment Setup | Yes | Table 3: Parameter Settings of Data Trimming. Adult, Bank, CelebA, Jigsaw Toxicity: optimizer SGD/Adam; learning rate 1e-2/1e-4; weight decay 1e-2/1e-6. Adult +noise, Bank +noise, CelebA +noise, Jigsaw Toxicity +noise: optimizer SGD/Adam; learning rate 1e-2/1e-1; weight decay 1e-2/1e-4.
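The flattened Table 3 row reads naturally as one SGD and one Adam configuration per dataset. A hedged sketch of that reading in PyTorch follows; pairing each learning rate and weight decay with its optimizer is inferred from the order of the slash-separated values and is not stated explicitly above:

```python
import torch

# Assumed reading of Table 3: {"dataset": {"optimizer": (lr, weight_decay)}}.
TRIM_CONFIGS = {
    "Adult":                 {"SGD": (1e-2, 1e-2), "Adam": (1e-4, 1e-6)},
    "Bank":                  {"SGD": (1e-2, 1e-2), "Adam": (1e-4, 1e-6)},
    "CelebA":                {"SGD": (1e-2, 1e-2), "Adam": (1e-4, 1e-6)},
    "Jigsaw Toxicity":       {"SGD": (1e-2, 1e-2), "Adam": (1e-4, 1e-6)},
    "Adult+noise":           {"SGD": (1e-2, 1e-2), "Adam": (1e-1, 1e-4)},
    "Bank+noise":            {"SGD": (1e-2, 1e-2), "Adam": (1e-1, 1e-4)},
    "CelebA+noise":          {"SGD": (1e-2, 1e-2), "Adam": (1e-1, 1e-4)},
    "Jigsaw Toxicity+noise": {"SGD": (1e-2, 1e-2), "Adam": (1e-1, 1e-4)},
}

def make_optimizer(params, dataset, name):
    # Build the optimizer named in Table 3 with its listed hyperparameters.
    lr, wd = TRIM_CONFIGS[dataset][name]
    cls = {"SGD": torch.optim.SGD, "Adam": torch.optim.Adam}[name]
    return cls(params, lr=lr, weight_decay=wd)
```

For example, `make_optimizer(model.parameters(), "Adult", "Adam")` yields an Adam optimizer with learning rate 1e-4 and weight decay 1e-6 under this reading.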