Incorporating Unlabeled Data into Distributionally Robust Learning
Authors: Charlie Frogner, Sebastian Claici, Edward Chien, Justin Solomon
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We examine the performance of this new formulation on 14 real data sets and find that it often yields effective classifiers with nontrivial performance guarantees in situations where conventional DRL produces neither. |
| Researcher Affiliation | Academia | Charlie Frogner EMAIL Sebastian Claici EMAIL Edward Chien EMAIL Justin Solomon EMAIL Computer Science & Artificial Intelligence Laboratory (CSAIL) Massachusetts Institute of Technology Cambridge, MA 02139, USA |
| Pseudocode | Yes | Algorithm 1 SGD for distributionally robust learning with unlabeled data Algorithm 2 SGD for distributionally robust active learning |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code for the methodology described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | In the experiments described in Sections 5.2, 5.1, and 6.3, we use 14 real data sets, taken from the UCI repository (Dua and Graff, 2019). |
| Dataset Splits | No | The paper describes sampling strategies like 'we sample a small number N_l of labeled examples' (Appendix D.1) or 'sample an initial set Ẑ_l of 20 labeled examples, with the remaining samples forming the unlabeled set X̂_u' (Appendix D.5). However, it does not provide specific train/test/validation dataset splits with exact percentages, sample counts, or citations to predefined splits that would allow for precise reproduction of a fixed data partitioning. |
| Hardware Specification | Yes | Average wall clock time per iteration is 0.022 seconds on a Xeon E5-2690. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer (Kingma and Ba, 2014)' but does not specify any software versions for libraries, programming languages, or other tools used in their implementation. |
| Experiment Setup | Yes | We solve (37) via its dual (Section 3.3), using the Adam optimizer (Kingma and Ba, 2014) with β1 = 0.9, β2 = 0.999, ε = 10⁻⁸, and a batch size of 100 and decreasing the learning rate by a factor of 8 every 10000 steps. In all experiments we use the transport cost c((x, y), (x', y')) = ‖x − x'‖² + κ\|y − y'\| with κ = 1 and y ∈ {+1, −1} ⊂ ℝ. |
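The experiment setup row quotes two concrete recipes: the transport cost c((x, y), (x', y')) = ‖x − x'‖² + κ\|y − y'\| with κ = 1, and a step learning-rate schedule (divide by 8 every 10000 steps). A minimal sketch of both, assuming NumPy arrays for features and ±1 labels (function names `transport_cost` and `lr_schedule` are illustrative, not from the paper):

```python
import numpy as np

def transport_cost(x, y, x_prime, y_prime, kappa=1.0):
    """Transport cost c((x, y), (x', y')) = ||x - x'||^2 + kappa * |y - y'|,
    with labels y, y' in {+1, -1} as in the paper's experiments."""
    x = np.asarray(x, dtype=float)
    x_prime = np.asarray(x_prime, dtype=float)
    return float(np.sum((x - x_prime) ** 2) + kappa * abs(y - y_prime))

def lr_schedule(base_lr, step):
    """Learning rate decreased by a factor of 8 every 10000 steps."""
    return base_lr / (8 ** (step // 10000))
```

For example, two points with identical features but opposite labels are at distance κ·\|(+1) − (−1)\| = 2 under this cost, so κ controls how expensive it is for an adversarial distribution to flip labels rather than perturb features.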