Loss factorization, weakly supervised learning and label noise robustness

Authors: Giorgio Patrini, Frank Nielsen, Richard Nock, Marcello Carioni

ICML 2016

Reproducibility assessment: each variable below is listed with its result and the supporting LLM response.

Research Type: Experimental
LLM response: "The theory is validated by experiments in which we call the adapted SGD as a black box. ... We analyze experimentally the theory so far developed. ... The next results are based on UCI data. We learn with logistic loss, without model's intercept and set λ = 10^-6 and T = 4·2m (4 epochs). We measure d_clean and R_{D,01}, injecting symmetric label noise p ∈ [0, 0.45) and averaging over 25 runs. ... We conclude with a systematic study of hold-out error of µSGD. The same datasets are now split in 1/5 test and 4/5 training sets once at random. In contrast with the previous experimental setting we perform cross-validation of λ ∈ 10^{-3,...,+3} on 5 folds in the training set. We compare with vanilla SGD run on corrupted sample S and measure the gain from estimating µ̂_S. ... Table 2 reports test error for SGD and µSGD over 25 trials of artificially corrupted datasets."

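For concreteness, the symmetric label noise model quoted above flips each binary label independently with probability p. A minimal sketch of the injection step (the function name and the loop over noise rates are ours, not the authors' code):

```python
import numpy as np

def inject_symmetric_noise(y, p, rng):
    """Flip each label in y (values in {-1, +1}) independently with
    probability p: the symmetric label noise model quoted above."""
    flips = rng.random(y.shape[0]) < p
    return np.where(flips, -y, y)

rng = np.random.default_rng(0)
y = rng.choice([-1, 1], size=1000)
for p in np.arange(0.0, 0.45, 0.05):      # noise rates p in [0, 0.45)
    y_noisy = inject_symmetric_noise(y, p, rng)
    print(f"p={p:.2f}  empirical flip rate={np.mean(y_noisy != y):.3f}")
```
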
Researcher Affiliation: Collaboration
Giorgio Patrini (1,2), Frank Nielsen (3,4), Richard Nock (2,1), Marcello Carioni (5). Affiliations: 1: Australian National University; 2: Data61; 3: Ecole Polytechnique; 4: Sony Computer Science Laboratories Inc.; 5: Max Planck Institute for Mathematics in the Sciences.

Pseudocode: Yes
LLM response: "Algorithm 1: µSGD ... Algorithm 2: µSGD applied on noisy labels"

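The two algorithms are not reproduced in this report, but the paper's core idea is compact enough to sketch. For a linear-odd loss such as the logistic loss, ℓ(z) − ℓ(−z) = −z, so the empirical risk factors into a label-free even part plus a linear term in the mean operator µ_S = (1/m) Σ_i y_i x_i; labels enter the objective only through µ_S, and under symmetric noise of rate p the rescaling µ̂ = µ_{noisy}/(1 − 2p) gives an unbiased estimate of the clean mean operator. The sketch below is ours, assuming the Pegasos-style step size η_t = 1/(λt) of Shalev-Shwartz et al. (2011) that the quoted setup points to; `mu_sgd` and its arguments are hypothetical names, not the authors' code.

```python
import numpy as np

def mu_sgd(X, y, lam=1e-6, epochs=4, noise_rate=None, seed=0):
    """Sketch of muSGD for the logistic (linear-odd) loss.

    Factorization used:  (1/m) sum_i l(y_i w.x_i)
      = (1/2m) sum_i [l(w.x_i) + l(-w.x_i)]  -  (1/2) w . mu_S,
    where mu_S = (1/m) sum_i y_i x_i is the mean operator, the only
    place labels appear (Algorithm 1). Under symmetric label noise of
    rate p, E[mu on noisy labels] = (1 - 2p) mu_clean, so dividing by
    (1 - 2p) de-noises the estimate (Algorithm 2).
    """
    rng = np.random.default_rng(seed)
    m, d = X.shape
    mu = (y[:, None] * X).mean(axis=0)        # mean operator mu_S
    if noise_rate is not None:
        mu = mu / (1.0 - 2.0 * noise_rate)    # noise-corrected estimate
    w = np.zeros(d)
    T = epochs * 2 * m                        # T = 4 * 2m in the quoted setup
    for t in range(1, T + 1):
        i = rng.integers(m)
        z = w @ X[i]
        # gradient of the label-free even part l(z) + l(-z); for the
        # logistic loss it equals x * tanh(z/2), which is numerically stable
        g_even = X[i] * np.tanh(0.5 * z)
        grad = 0.5 * g_even - 0.5 * mu + lam * w   # plus L2 regularizer
        w -= grad / (lam * t)                 # Pegasos step size 1/(lam * t)
    return w

# Toy usage: corrupt labels at rate p = 0.2, then run Algorithm 2.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = np.sign(X @ rng.normal(size=5))
y_noisy = np.where(rng.random(200) < 0.2, -y, y)
w = mu_sgd(X, y_noisy, noise_rate=0.2)       # de-noised mean operator
```

Everything in the inner loop is label-free; switching from Algorithm 1 to Algorithm 2 changes only how µ_S is estimated.
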
Open Source Code: No
LLM response: No statement or link is provided regarding the availability of open-source code for the described methodology.

Open Datasets: No
LLM response: "The next results are based on UCI data."

Dataset Splits: Yes
LLM response: "The same datasets are now split in 1/5 test and 4/5 training sets once at random. In contrast with the previous experimental setting we perform cross-validation of λ ∈ 10^{-3,...,+3} on 5 folds in the training set."

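The quoted protocol is a single random 1/5 test, 4/5 train split followed by 5-fold cross-validation over λ ∈ {10^-3, ..., 10^+3}. A sketch of that selection loop, using placeholder data and a regularized square-loss solver as a stand-in for the learner (in the paper this slot is filled by SGD or µSGD on the logistic loss; `fit_square_loss` and `zero_one_error` are our names):

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

def fit_square_loss(X, y, lam):
    """Regularized square-loss stand-in for the learner."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * len(y) * np.eye(d), X.T @ y)

def zero_one_error(w, X, y):
    return np.mean(np.sign(X @ w) != y)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                     # placeholder data
y = np.sign(X @ rng.normal(size=10) + 0.1 * rng.normal(size=500))

# Single random 1/5 test, 4/5 training split, as quoted.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

folds = KFold(n_splits=5, shuffle=True, random_state=0)
lambdas = 10.0 ** np.arange(-3, 4)                 # lambda in 10^{-3,...,+3}
cv_err = {lam: np.mean([zero_one_error(fit_square_loss(X_tr[tr], y_tr[tr], lam),
                                       X_tr[va], y_tr[va])
                        for tr, va in folds.split(X_tr)])
          for lam in lambdas}
best = min(cv_err, key=cv_err.get)
w = fit_square_loss(X_tr, y_tr, best)
print(f"best lambda={best:g}  test error={zero_one_error(w, X_te, y_te):.3f}")
```
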
Hardware Specification: No
LLM response: No specific hardware details (such as GPU/CPU models, processor types, or memory amounts) are mentioned for the experimental setup.

Software Dependencies: No
LLM response: "We learn with logistic loss... The learning rate η is untouched from Shalev-Shwartz et al. (2011)... We learn with λ = 10^-6 by standard square loss."

Experiment Setup: Yes
LLM response: "We learn with logistic loss, without model's intercept and set λ = 10^-6 and T = 4·2m (4 epochs). ... we perform cross-validation of λ ∈ 10^{-3,...,+3} on 5 folds in the training set."