Data Valuation in the Absence of a Reliable Validation Set
Authors: Himanshu Jahagirdar, Jiachen T. Wang, Ruoxi Jia
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical Evaluation. We demonstrate the effectiveness of LOOCV-based data valuation techniques on important downstream tasks. Compared with validation-based techniques, we show that LOOCV-based data valuation techniques achieve comparable performance on the weighted accuracy task and (often) superior performance on the noisy label detection task. We also show that RLS with a Gaussian kernel is an effective proxy model for valuation: the computed data value scores perform better on these downstream tasks than their validation-based counterparts. |
| Researcher Affiliation | Academia | Himanshu Jahagirdar EMAIL Virginia Tech Jiachen T. Wang EMAIL Princeton University Ruoxi Jia EMAIL Virginia Tech |
| Pseudocode | No | The paper describes methods using mathematical formulations and prose, but does not contain any clearly labeled pseudocode or algorithm blocks. For example, Section 4 details the proposed approach and its components but without a dedicated algorithm box. |
| Open Source Code | No | The paper does not contain any explicit statement about providing source code, nor does it include links to a code repository in the main text or supplementary materials. Phrases like 'We release our code...' or links to GitHub are absent. |
| Open Datasets | Yes | We evaluate data values over 9 classification datasets popularly used in data valuation literature (refer Appendix B.1). For example, the paper mentions using the 'Census Dataset from the UCI Repository (Dua & Graff, 2017)', 'Credit Card Data (Yeh & Lien, 2009)', 'CIFAR10', and 'MNIST'. |
| Dataset Splits | Yes | A validation-free paradigm for data valuation using Leave-One-Out Cross-Validation (LOOCV). Recognizing the limitations of validation-based data valuation techniques, we propose a novel validation-free approach using Leave-One-Out Cross-Validation (LOOCV) to estimate performance scores on the population. Cross-Validation (CV) is a widely-used technique in statistical machine learning for estimating the generalizability of a trained model to the population distribution. In a K-fold CV, data is randomly partitioned into K equal-sized subsets. The model is trained on K−1 subsets and tested on the remaining one, repeating this process K times and averaging the validation performance over the remaining subset. Leave-one-out cross-validation (LOOCV) is a special case of K-fold where K equals the total sample size. That is, it trains the model on all data points except one, and repeats this for each data point. |
| Hardware Specification | No | The paper makes a general statement about future potential, 'Additionally, it opens the potential for parallel computation of f_i for all i via GPU operations', but does not specify any actual hardware (like CPU, GPU models, or cloud configurations) used for the experiments presented in the paper. |
| Software Dependencies | No | The paper mentions using 'standard models (either binary MLP or logistic regression)' and an 'RLS model' but does not specify any software frameworks, libraries, or their version numbers (e.g., 'PyTorch 1.9', 'Scikit-learn 0.24') that would be necessary for reproducibility. |
| Experiment Setup | Yes | LOOCV calculation (outlined in Theorem 2) involves computing the efficient cross-validation accuracy (using Theorem 5) on an RLS model (λ = 0.1) with a Gaussian kernel. Additionally, we perform an ablation study on the effect of changing the parameter λ in Appendix B.6. In all experiments, labels have been randomly flipped with a fixed poison ratio of 10%. |
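The "efficient cross-validation accuracy" referenced in the Experiment Setup row relies on a well-known closed-form LOOCV identity for regularized least squares: with hat matrix H = K(K + λI)⁻¹, the leave-one-out prediction for point i is (ŷᵢ − Hᵢᵢ yᵢ)/(1 − Hᵢᵢ), so all n leave-one-out fits come from a single training pass. A minimal sketch of this identity (not the authors' code; the kernel width `gamma` and the function names are illustrative assumptions):

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix via broadcasting over pairwise squared distances."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def rls_loocv_predictions(X, y, lam=0.1, gamma=1.0):
    """Closed-form LOOCV predictions for kernel RLS.

    Uses the identity f_{-i}(x_i) = (f(x_i) - H_ii * y_i) / (1 - H_ii),
    where H = K (K + lam*I)^{-1}, so no model is ever retrained.
    """
    n = len(y)
    K = gaussian_kernel(X, X, gamma)
    H = K @ np.linalg.inv(K + lam * np.eye(n))  # smoother ("hat") matrix
    y_hat = H @ y                               # full-data fitted values
    h = np.diag(H)                              # leverage of each point
    return (y_hat - h * y) / (1.0 - h)
```

For classification with ±1 labels (as in the paper's noisy-label setting), the LOOCV accuracy is then the fraction of points where `sign` of the leave-one-out prediction matches the label; the closed form makes this O(n³) once, rather than O(n⁴) for n brute-force refits.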