Dual Iterative Hard Thresholding
Authors: Xiao-Tong Yuan, Bo Liu, Lezi Wang, Qingshan Liu, Dimitris N. Metaxas
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Numerical results confirm our theoretical predictions and demonstrate the superiority of DIHT over state-of-the-art primal IHT-style algorithms in model estimation accuracy and computational efficiency." Section 5 (Experiments) presents a numerical study in two parts: a theory-verification part, with simulations on sparse linear regression problems to verify the strong/approximate sparse duality theorems established in Section 3, and an algorithm-evaluation part, with experiments on synthetic and real data sets assessing the numerical performance of DIHT and SDIHT on sparse linear regression and hinge loss minimization tasks. |
| Researcher Affiliation | Collaboration | Xiao-Tong Yuan, B-DAT Lab and CICAEET, Nanjing University of Information Science and Technology, Nanjing 210044, China; Bo Liu, JD Finance America Corporation, Mountain View, CA 94043, USA; Lezi Wang, Department of Computer Science, Rutgers University, Piscataway, NJ 08854, USA |
| Pseudocode | Yes | Algorithm 1: Dual Iterative Hard Thresholding (DIHT) Algorithm 2: Stochastic Dual Iterative Hard Thresholding (SDIHT) |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It only mentions data sets availability. |
| Open Datasets | Yes | Two binary benchmark data sets from the LibSVM data repository, RCV1 (d = 47,236) (Lewis et al., 2004) and News20 (d = 1,355,191) (Lang, 1995), are used for algorithm efficiency evaluation and comparison. For the RCV1 data set, N = 500,000 samples (N > d) are selected for model training and the remaining 197,641 samples for testing. For the News20 data set, N = 15,000 samples (d > N) are used for training and the remaining 4,996 samples as test data. The data sets are available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html. |
| Dataset Splits | Yes | For the RCV1 data set, N = 500,000 samples are selected for model training and the remaining 197,641 samples for testing. For the News20 data set, N = 15,000 samples are used for training and the remaining 4,996 samples as test data. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | For the simulation study, two baseline dimensionality-sparsity configurations (d, k) ∈ {(30, 5), (500, 50)} are tested, varying the regularization strength λ and noise level σ; the regularization strength is set as λ = λ0/√N for λ0 ∈ {0.1, 1, 10} in theory verification and λ0 ∈ {1, 10} in algorithm evaluation. For the two stochastic algorithms, SDIHT and SVR-GHT, the training data is uniformly randomly divided into mini-batches of size 10, with regularization parameters λ = λ0/√N for λ0 ∈ {0.4, 1.2, 2}. In the non-smooth case, the step-size in DIHT and SDIHT is set to η(t) = c/(t+2), where c is a constant determined by grid search for optimal efficiency. |
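Two ingredients named in the rows above (the hard-thresholding step that gives DIHT its name, and the diminishing step-size schedule η(t) = c/(t+2) used in the non-smooth case) can be sketched in a few lines. This is a minimal illustration only, not the paper's implementation; the function names and the default c are assumptions for the example.

```python
import numpy as np

def hard_threshold(x, k):
    """Hard-thresholding operator: keep the k largest-magnitude
    entries of x and zero out the rest."""
    out = np.zeros_like(x)
    if k <= 0:
        return out
    # Indices of the k entries with largest absolute value.
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out

def step_size(t, c=1.0):
    """Diminishing step size eta(t) = c / (t + 2), as used by
    DIHT/SDIHT for non-smooth losses; c is tuned by grid search."""
    return c / (t + 2)

# Example: keep the 2 largest-magnitude coordinates.
x = np.array([3.0, -1.0, 0.5, -4.0, 2.0])
print(hard_threshold(x, 2))   # zeros all but the entries 3.0 and -4.0
print(step_size(0), step_size(8))
```

In DIHT-style methods, an operator of this form is applied to a (dual-derived) primal iterate at every step to maintain k-sparsity, while the step size decays over iterations t.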