Importance Sampling for Minibatches

Authors: Dominik Csiba, Peter Richtárik

JMLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We illustrate on synthetic problems that for training data of certain properties, our sampling can lead to several orders of magnitude improvement in training time. We then test the new sampling on several popular data sets, and show that the improvement can reach an order of magnitude." From Section 7 (Experiments): "We now comment on the results of our numerical experiments, with both synthetic and real data sets. We plot the optimality gap P(w^{(t)}) − P(w^*) and, in the case of real data, also the test error (vertical axis) against the computational effort (horizontal axis)."
Researcher Affiliation | Academia | Dominik Csiba (EMAIL), School of Mathematics, University of Edinburgh, Edinburgh, United Kingdom; Peter Richtárik (EMAIL), School of Mathematics, University of Edinburgh, Edinburgh, United Kingdom.
Pseudocode | Yes | Algorithm 1 (dfSDCA), Csiba and Richtárik (2015)
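The row above only names the pseudocode; the algorithm itself is not reproduced here. As a rough illustration of the general importance-sampling idea behind such minibatch methods (not the authors' Algorithm 1 — the sampling distribution and all names below are assumptions for illustration), one can draw a minibatch of size τ with per-example inclusion chances biased by importance scores:

```python
import numpy as np

def sample_minibatch(scores, tau, rng):
    """Draw a size-tau minibatch without replacement, with inclusion
    chances proportional to `scores` (a generic importance-sampling
    sketch, not the paper's exact distribution)."""
    p = np.asarray(scores, dtype=float)
    p = p / p.sum()
    return rng.choice(len(p), size=tau, replace=False, p=p)

rng = np.random.default_rng(0)
scores = rng.chisquare(1, size=100)      # hypothetical per-example scores
batch = sample_minibatch(scores, tau=8, rng=rng)
```

Examples with larger scores are sampled more often, which is the mechanism the paper exploits to speed up training.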
Open Source Code | No | The paper does not provide explicit statements about releasing source code for the methodology described, nor does it include a link to a code repository. Footnote 3 refers to data sets, not code.
Open Datasets | Yes | "We used several publicly available data sets, summarized in Table 5..." (Footnote 3: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/)
Dataset Splits | Yes | "We used several publicly available data sets, summarized in Table 5, which we randomly split into a train (80%) and a test (20%) part."
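The quoted 80%/20% random split can be reproduced with a few lines; the function name and seed below are illustrative, not from the paper:

```python
import numpy as np

def train_test_split(n, train_frac=0.8, seed=0):
    """Random index split mirroring the paper's 80/20 protocol."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    cut = int(train_frac * n)
    return perm[:cut], perm[cut:]

train_idx, test_idx = train_test_split(1000)
```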
Hardware Specification | No | The paper does not provide specific hardware details, such as CPU or GPU models, used for the experiments. It instead reports "computational effort", an implementation-independent proxy for time.
Software Dependencies | No | The paper mentions Julia in Table 2 for creating artificial data (e.g., `L = rand(chisq(1), n)`), but it does not specify a version number for Julia or for any other software libraries used to implement the algorithms.
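The quoted Julia call `L = rand(chisq(1), n)` draws n samples from a chi-squared distribution with 1 degree of freedom. A Python equivalent (the sample size and seed are illustrative) would be:

```python
import numpy as np

# n draws from chi-squared with 1 degree of freedom,
# mirroring the Julia expression rand(chisq(1), n).
n = 10_000
rng = np.random.default_rng(42)
L = rng.chisquare(df=1, size=n)

# Chi-squared(1) is nonnegative with mean 1, so the sample
# mean of a large draw should land close to 1.
```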
Experiment Setup | Yes | "In all experiments we used the logistic loss φ_i(z) = log(1 + e^{−y_i z}) and set the regularizer to λ = max_i ‖X_{:i}‖ / n. ... The values of τ we used to plot are τ ∈ {1, 2, 4, 8, 16, 32}."