Nesterov's Acceleration for Approximate Newton
Authors: Haishan Ye, Luo Luo, Zhihua Zhang
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we will validate our theory empirically. We first compare accelerated regularized sub-sampled Newton (Algorithm 2) with regularized sub-sampled Newton (Algorithm 3 in Appendix, referred to as RSSN; Roosta-Khorasani and Mahoney, 2016) on ridge regression, whose objective function is quadratic. Then we conduct more experiments on a popular machine learning problem, ridge logistic regression, and compare accelerated regularized sub-sampled Newton with other classical methods. |
| Researcher Affiliation | Academia | Haishan Ye, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Shenzhen, China; Luo Luo, Department of Mathematics, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong; Zhihua Zhang, National Engineering Lab for Big Data Analysis and Applications, School of Mathematical Sciences, Peking University, 5 Yiheyuan Road, Beijing, China |
| Pseudocode | Yes | Algorithm 1 Accelerated Approximate Newton. 1: Input: x(0) and x(1) are initial points sufficiently close to x*; θ is the acceleration parameter. 2: for t = 1, . . . until termination do 3: Construct an approximate Hessian H(t) satisfying Eqn. (4); 4: y(t) = x(t) + ((1 − θ)/(1 + θ))(x(t) − x(t−1)); 5: x(t+1) = y(t) − [H(t)]⁻¹∇F(y(t)). 6: end for |
| Open Source Code | No | The paper mentions that 'All algorithms except SVRG are implemented in MATLAB' and 'SVRG... we implement it with C++ language.' However, it does not provide an explicit statement about code release or any links to a code repository for the methodology described. |
| Open Datasets | Yes | We conduct our experiments on six datasets: gisette, sido0, svhn, rcv1, real-sim, and avazu. ... Table 4: Datasets summary (sparsity = #Non-Zero Entries / (n·d)): gisette (dense; libsvm dataset), sido0 (dense; Guyon), svhn (dense; libsvm dataset), rcv1 (0.16% sparsity; libsvm dataset), real-sim (0.24% sparsity; libsvm dataset), avazu (0.0015% sparsity; libsvm dataset). We use the first 2,085,163 training samples of the full avazu training set. |
| Dataset Splits | No | The paper describes using several public datasets but does not explicitly state how these datasets were split into training, validation, or test sets for the experiments. |
| Hardware Specification | Yes | All experiments are conducted on a laptop with Intel i7-7700HQ CPU and 8GB RAM. |
| Software Dependencies | No | The paper mentions that 'All algorithms except SVRG are implemented in MATLAB' and 'SVRG... we implement it with C++ language.' However, it does not provide specific version numbers for MATLAB, C++, or any other software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | In our experiments, we set the regularizer λ = 1. In the experiments, we set the sample size |S| to be 1%n, 5%n, and 10%n, respectively. The regularizer α of Algorithm 2 is properly chosen according to |S|. ARSSN and RSSN share the same sample size |S|. The acceleration parameter θ is appropriately selected and fixed. ... we try different settings of the regularizer λ, as 1/n, 10⁻¹/n, and 10⁻²/n, to represent different levels of regularization. Furthermore, in our experiments, we choose the zero vector x = [0, . . . , 0]ᵀ ∈ ℝᵈ as the initial point. |
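The experimental setting above (Algorithm 1's accelerated iteration with a sub-sampled, regularized Hessian applied to ridge regression, zero initial point, λ = 1) can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the specific values of `alpha`, `theta`, and `sample_frac`, and the exact way α enters the approximate Hessian, are assumptions for illustration; the paper tunes α and θ per dataset and sample size.

```python
import numpy as np

def arssn_ridge(A, b, lam=1.0, alpha=0.1, theta=0.5, sample_frac=0.05,
                n_iters=50, seed=0):
    """Sketch of accelerated regularized sub-sampled Newton (ARSSN) on
    ridge regression F(x) = 1/(2n)||Ax - b||^2 + (lam/2)||x||^2.

    `alpha` (Hessian regularizer) and `theta` (acceleration parameter)
    are illustrative assumptions, not the paper's tuned values.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    s = max(1, int(sample_frac * n))   # sub-sample size |S|
    x_prev = np.zeros(d)               # paper initializes at the zero vector
    x = np.zeros(d)
    momentum = (1.0 - theta) / (1.0 + theta)
    for _ in range(n_iters):
        # Nesterov extrapolation (line 4 of Algorithm 1)
        y = x + momentum * (x - x_prev)
        # sub-sampled approximate Hessian with extra regularizer alpha
        idx = rng.choice(n, size=s, replace=False)
        H = A[idx].T @ A[idx] / s + (lam + alpha) * np.eye(d)
        # full gradient of the ridge objective at y
        grad = A.T @ (A @ y - b) / n + lam * y
        # approximate Newton step (line 5 of Algorithm 1)
        x_prev, x = x, y - np.linalg.solve(H, grad)
    return x
```

Because only the Hessian is approximated (the gradient is exact), the minimizer of the ridge objective remains a fixed point of the iteration; sub-sampling noise affects the contraction rate, not the limit.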