reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Theory of Learning with Corrupted Labels

Authors: Brendan van Rooyen, Robert C. Williamson

JMLR 2017

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Here we begin to develop an abstract framework for tackling these problems. We present a generic method for learning from a fixed, known, reconstructible corruption, along with an analysis of its statistical properties. We demonstrate the utility of our framework via concrete novel results in solving supervised learning problems wherein the labels are corrupted, such as learning with noisy labels, semi-supervised learning and learning with partial labels. The concrete contributions of this paper are: A general method for learning from corrupted labels based on a generalization of the method of unbiased estimators... Upper and lower bounds on the risk of learning from combinations of corrupted labels... Demonstration of the computational feasibility of our approach via the preservation of convexity...
Researcher Affiliation	Academia	Brendan van Rooyen and Robert C. Williamson, The Australian National University and Data61, Canberra ACT 2601, Australia
Pseudocode	No	The paper describes mathematical methods and algorithms using prose and mathematical notation (e.g., $A_{\mathrm{ERM}}(S) = \arg\min_{a \in A} \ell_R(S, a)$), but does not contain any structured pseudocode blocks or algorithms labeled as such.
Open Source Code	No	The paper does not contain any explicit statements about releasing source code, nor does it provide links to any code repositories or supplementary materials for code.
Open Datasets	No	The paper primarily presents a theoretical framework and uses general problem types like 'Learning with Label Noise', 'Semi-Supervised Learning', and 'Learning with partial labels' as examples. It does not describe experiments performed on specific publicly available datasets.
Dataset Splits	No	The paper is theoretical and does not describe empirical experiments involving datasets. Therefore, there are no mentions of dataset splits (training, validation, test) or cross-validation setups.
Hardware Specification	No	The paper is theoretical and focuses on a mathematical framework and proofs. It does not describe any computational experiments or mention specific hardware used for any part of the research.
Software Dependencies	No	The paper is theoretical and does not describe any computational experiments that would require specific software or library versions. Thus, no software dependencies are mentioned.
Experiment Setup	No	The paper is theoretical and develops a conceptual framework and mathematical bounds. It does not include any experimental results from a practical implementation, so no experimental setup details such as hyperparameters, training configurations, or system-level settings are provided.