Nesterov's Acceleration for Approximate Newton

Authors: Haishan Ye, Luo Luo, Zhihua Zhang

JMLR 2020

Reproducibility assessment — each entry lists the variable, the result, and the supporting LLM response.
Research Type: Experimental. "In this section, we will validate our theory empirically. We first compare accelerated regularized sub-sampled Newton (ARSSN, Algorithm 2) with regularized sub-sampled Newton (RSSN, Algorithm 3 in the Appendix; Roosta-Khorasani and Mahoney, 2016) on ridge regression, whose objective function is quadratic. Then we conduct more experiments on a popular machine learning problem, ridge logistic regression, and compare ARSSN with other classical methods."
Researcher Affiliation: Academia. Haishan Ye — Shenzhen Research Institution of Big Data, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Shenzhen, China. Luo Luo — Department of Mathematics, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. Zhihua Zhang — National Engineering Lab for Big Data Analysis and Applications, School of Mathematical Sciences, Peking University, 5 Yiheyuan Road, Beijing, China.
Pseudocode: Yes. Algorithm 1 Accelerated Approximate Newton.
1: Input: x(0) and x(1) are initial points sufficiently close to x*; θ is the acceleration parameter.
2: for t = 1, . . . until termination do
3:   Construct an approximate Hessian H(t) satisfying Eqn. (4);
4:   y(t) = x(t) + (1−θ)/(1+θ) · (x(t) − x(t−1));
5:   x(t+1) = y(t) − [H(t)]^{−1} ∇F(y(t)).
6: end for
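The iteration above can be sketched in NumPy on a toy ridge-regression problem, the quadratic setting the paper uses for its first comparison. The sub-sampled Hessian below (a random row sample of the data plus the ridge term) and the values of `theta` and the sample size are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def accelerated_approx_newton(grad, hess_approx, x0, theta, iters=50):
    """Sketch of Algorithm 1 (Accelerated Approximate Newton).

    grad(x)        -> gradient of F at x
    hess_approx(y) -> an approximate Hessian H(t) (the Eqn. (4)
                      approximation condition is assumed to hold)
    theta          -> acceleration parameter
    """
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        # Momentum step: y(t) = x(t) + (1-theta)/(1+theta) (x(t) - x(t-1))
        y = x + (1 - theta) / (1 + theta) * (x - x_prev)
        # Approximate Newton step: x(t+1) = y(t) - H(t)^{-1} grad F(y(t))
        H = hess_approx(y)
        x_prev, x = x, y - np.linalg.solve(H, grad(y))
    return x

# Toy ridge regression: F(x) = (1/2n)||Ax - b||^2 + (lam/2)||x||^2
rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
lam = 1.0

grad = lambda x: A.T @ (A @ x - b) / n + lam * x

def hess_approx(y, s=50):
    # Sub-sampled Hessian: average over a random row sample plus lam*I
    S = rng.choice(n, size=s, replace=False)
    return A[S].T @ A[S] / s + lam * np.eye(d)

x_star = np.linalg.solve(A.T @ A / n + lam * np.eye(d), A.T @ b / n)
x_hat = accelerated_approx_newton(grad, hess_approx, np.zeros(d), theta=0.3)
err = np.linalg.norm(x_hat - x_star)
```

Because the objective is quadratic, the exact Newton step would converge in one iteration; the momentum term compensates for the error introduced by using a sub-sampled Hessian instead.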
Open Source Code: No. The paper mentions that "All algorithms except SVRG are implemented in MATLAB" and "SVRG... we implement it with C++ language." However, it does not provide an explicit statement about code release or any links to a code repository for the methodology described.
Open Datasets: Yes. "We conduct our experiments on six datasets: gisette, sido0, svhn, rcv1, real-sim, and avazu. ... We use the first 2,085,163 training samples of the full avazu training set."

Table 4: Datasets summary (sparsity = #Non-Zero Entries / (n · d)).

Dataset | Sparsity | Source
gisette | dense | libsvm dataset
sido0 | dense | Guyon
svhn | dense | libsvm dataset
rcv1 | 0.16% | libsvm dataset
real-sim | 0.24% | libsvm dataset
avazu | 0.0015% | libsvm dataset
Dataset Splits: No. The paper describes using several public datasets but does not explicitly state how these datasets were split into training, validation, or test sets for the experiments.
Hardware Specification: Yes. "All experiments are conducted on a laptop with Intel i7-7700HQ CPU and 8GB RAM."
Software Dependencies: No. The paper mentions that "All algorithms except SVRG are implemented in MATLAB" and "SVRG... we implement it with C++ language." However, it does not provide specific version numbers for MATLAB, the C++ toolchain, or any other software libraries or dependencies used in the experiments.
Experiment Setup: Yes. "In our experiments, we set the regularizer λ = 1. In the experiments, we set the sample size |S| to be 1%n, 5%n, and 10%n, respectively. The regularizer α of Algorithm 2 is properly chosen according to |S|. ARSSN and RSSN share the same sample size |S|. The acceleration parameter θ is appropriately selected and fixed. ... we try different settings of the regularizer λ, as 1/n, 10^{-1}/n, and 10^{-2}/n, to represent different levels of regularization. Furthermore, in our experiments, we choose the zero vector x = [0, . . . , 0]^T ∈ R^d as the initial point."
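For the ridge logistic regression experiments, the regularized sub-sampled Hessian that Algorithm 2 builds each iteration can be sketched as below. The structure (a sub-sampled Gram matrix weighted by the sigmoid curvatures, plus the ridge term λI and the extra regularizer αI) follows the RSSN/ARSSN description; the specific value of `alpha` and the synthetic data are illustrative assumptions, since the paper only says α is chosen according to |S|:

```python
import numpy as np

def subsampled_hessian_logistic(A, x, lam, alpha, frac, rng):
    """Regularized sub-sampled Hessian for ridge logistic regression (sketch).

    H = (1/|S|) * sum_{i in S} sigma_i (1 - sigma_i) a_i a_i^T + (lam + alpha) I,
    with |S| = frac * n, mirroring the paper's sample sizes of 1%, 5%, 10% of n.
    """
    n, d = A.shape
    s = max(1, int(frac * n))
    S = rng.choice(n, size=s, replace=False)          # uniform row sample
    sig = 1.0 / (1.0 + np.exp(-A[S] @ x))             # sigmoid at the sample
    w = sig * (1.0 - sig)                             # per-sample curvature
    return (A[S] * w[:, None]).T @ A[S] / s + (lam + alpha) * np.eye(d)

rng = np.random.default_rng(0)
n, d = 1000, 20
A = rng.standard_normal((n, d))
x0 = np.zeros(d)  # zero vector as initial point, as in the experiments
# lam = 1/n matches one of the paper's settings; alpha = 0.1 is hypothetical
H = subsampled_hessian_logistic(A, x0, lam=1.0 / n, alpha=0.1, frac=0.05, rng=rng)
```

The αI term is what makes the sub-sampled Hessian safe to invert even when the sample Gram matrix is rank-deficient, which is why α must grow as the sample size |S| shrinks.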