Loss factorization, weakly supervised learning and label noise robustness
Authors: Giorgio Patrini, Frank Nielsen, Richard Nock, Marcello Carioni
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The theory is validated by experiments in which we call the adapted SGD as a black box. ... We analyze experimentally the theory so far developed. ... The next results are based on UCI data. We learn with logistic loss, without model's intercept, and set λ = 10⁻⁶ and T = 4·2m (4 epochs). We measure d_clean and R_{D,01}, injecting symmetric label noise p ∈ [0, 0.45) and averaging over 25 runs. ... We conclude with a systematic study of hold-out error of µSGD. The same datasets are now split in 1/5 test and 4/5 training sets once at random. In contrast with the previous experimental setting we perform cross-validation of λ ∈ 10^{−3,...,+3} on 5 folds in the training set. We compare with vanilla SGD run on corrupted sample S and measure the gain from estimating µ̂_S. ... Table 2 reports test error for SGD and µSGD over 25 trials of artificially corrupted datasets. |
| Researcher Affiliation | Collaboration | Giorgio Patrini¹,², Frank Nielsen³,⁴, Richard Nock²,¹, Marcello Carioni⁵ — Australian National University¹, Data61², Ecole Polytechnique³, Sony Computer Science Laboratories Inc.⁴, Max Planck Institute for Mathematics in the Sciences⁵ |
| Pseudocode | Yes | Algorithm 1 µSGD... Algorithm 2 µSGD applied on noisy labels |
| Open Source Code | No | No statement or link is provided regarding the availability of open-source code for the described methodology. |
| Open Datasets | No | The next results are based on UCI data. |
| Dataset Splits | Yes | The same datasets are now split in 1/5 test and 4/5 training sets once at random. In contrast with the previous experimental setting we perform cross-validation of λ ∈ 10^{−3,...,+3} on 5 folds in the training set. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, processor types, or memory amounts) are mentioned for the experimental setup. |
| Software Dependencies | No | We learn with logistic loss... The learning rate η is untouched from Shalev-Shwartz et al. (2011)... We learn with λ = 10⁻⁶ by standard square loss. |
| Experiment Setup | Yes | We learn with logistic loss, without model's intercept, and set λ = 10⁻⁶ and T = 4·2m (4 epochs). ... we perform cross-validation of λ ∈ 10^{−3,...,+3} on 5 folds in the training set. |
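The noise-injection and split protocol quoted in the table can be sketched as follows. This is our own minimal numpy reconstruction, not the authors' code: the function names, the ±1 label encoding, and the exact fold assignment are assumptions; only the quantities (flip rate p ∈ [0, 0.45), 1/5–4/5 split, 5 folds, λ grid 10^{−3..+3}) come from the quoted excerpts.

```python
import numpy as np

# Regularization grid cross-validated in the paper: lambda in 10^{-3,...,+3}.
LAMBDA_GRID = 10.0 ** np.arange(-3, 4)

def inject_symmetric_noise(y, p, rng):
    """Flip each +/-1 label independently with probability p
    (symmetric label noise, as in the quoted setup)."""
    flips = rng.random(len(y)) < p
    return np.where(flips, -y, y)

def split_train_test(n, test_frac=0.2, rng=None):
    """One random 1/5 test, 4/5 training split of n example indices."""
    idx = rng.permutation(n)
    n_test = int(round(test_frac * n))
    return idx[n_test:], idx[:n_test]  # (train indices, test indices)

def five_fold_indices(n_train, rng):
    """5-fold CV index partition over the training set,
    used to select lambda from LAMBDA_GRID."""
    idx = rng.permutation(n_train)
    return np.array_split(idx, 5)
```

In the paper's protocol these pieces would be repeated over 25 trials of corrupted data, with µSGD and vanilla SGD trained on the noisy labels and compared on the held-out fifth.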