A Forward Approach for Sufficient Dimension Reduction in Binary Classification
Authors: Jongkyeong Kang, Seung Jun Shin
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a simulation study to evaluate the finite-sample performance of the wOPG method. We set π_h = h/10, h = 1, ..., 9, and employ the Gaussian kernel K(x, x′) = exp{−‖x − x′‖²/(2σ²)} for the RKHS, with σ being the median of the pairwise distances between the predictors in the positive and negative classes (Jaakkola et al., 1999). ... Table 1 contains the averaged d(B̂, B) over 100 independent repetitions under models (I)–(III) with different combinations of (n, p) ∈ {500, 1000} × {10, 20}. In Table 1, one can observe that SAVE, pHd, and DR, developed under the regression context, exhibit worse performance than the rest of the methods carefully designed for the binary response. ... We applied both wOPG-LR and wOPG-SVM to the Breast Cancer Coimbra (BCC) data available at the UCI machine learning repository (https://archive.ics.uci.edu/ml/index.php). ... Finally, we conducted a validation study in order to evaluate the effect of SDR in terms of classification performance. Toward this, we randomly split the data into training and test sets ... These steps were repeated independently a hundred times, and Figure 3 compares the boxplots of test error rates for different SDR methods. |
| Researcher Affiliation | Academia | Jongkyeong Kang, Department of Information Statistics, Kangwon National University, Gangwon-do, 24341, Korea, and Department of Statistics, Korea University, Seoul, 02841, Korea; Seung Jun Shin, Department of Statistics, Korea University, Seoul, 02841, Korea |
| Pseudocode | Yes | Appendix B. Computing Algorithms. In this section, we suppress π for the sake of simplicity. Let α = (α_0, ..., α_n)^⊤, and ω_ij = w_s(x_i − x_j). B.1 wOPG-LR: For the wOPG-LR, we have the following objective function for (18). ... (followed by equations and descriptions of iterative updates) B.2 wOPG-SVM: For the wOPG-SVM, (18) with the hinge loss can be equivalently written as ... (followed by equations and descriptions of iterative updates) |
| Open Source Code | No | The paper does not provide an explicit statement or link to its source code. While it mentions the Journal of Machine Learning Research and a CC-BY 4.0 license for the paper itself, this does not pertain to the implementation code of the methodology. A dataset repository link (UCI) is provided, but no code repository. |
| Open Datasets | Yes | We applied both wOPG-LR and wOPG-SVM to the Breast Cancer Coimbra (BCC) data available at the UCI machine learning repository (https://archive.ics.uci.edu/ml/index.php). The BCC data contains breast cancer diagnosis results for 116 patients with nine continuous predictors including age, body mass index, and seven measurements from the blood test, i.e., glucose, insulin, homeostatic model assessment (HOMA), leptin, adiponectin, resistin, and monocyte chemoattractant protein-1 (MCP-1). See Hosni et al. (2019) for more details about the data. |
| Dataset Splits | Yes | Finally, we conducted a validation study in order to evaluate the effect of SDR in terms of classification performance. Toward this, we randomly split the data into training and test sets denoted by D^tr = {(y_1^tr, x_1^tr), ..., (y_58^tr, x_58^tr)} and D^ts = {(y_1^ts, x_1^ts), ..., (y_58^ts, x_58^ts)}, respectively. We then applied various SDR methods to D^tr and obtained the estimated basis of S_{Y|X} denoted by B̂^tr. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running its experiments, such as GPU or CPU models, or specific cloud instances. |
| Software Dependencies | No | The paper does not provide specific details about ancillary software, such as library names with version numbers, that are needed to replicate the experiment. |
| Experiment Setup | Yes | We set π_h = h/10, h = 1, ..., 9, and employ the Gaussian kernel K(x, x′) = exp{−‖x − x′‖²/(2σ²)} for the RKHS, with σ being the median of the pairwise distances between the predictors in the positive and negative classes (Jaakkola et al., 1999). Tuning parameters λ and θ_ℓ, ℓ = 0, 2, ..., p, and the bandwidth parameter s are chosen as described in Section 4.2. ... In this article, we set π_h = h/(H + 1), h = 1, ..., H with H = 9, i.e., π_h = h/10, h = 1, ..., 9, and thus δ = 0.1. |
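The kernel setup quoted above (a Gaussian kernel with σ set to the median pairwise distance between predictors, per Jaakkola et al., 1999) is standard and easy to sketch. Below is a minimal, hypothetical Python illustration of that heuristic; the function names `median_heuristic_bandwidth` and `gaussian_kernel` are our own and do not come from the paper's (unreleased) code.

```python
import numpy as np

def median_heuristic_bandwidth(X):
    """Median of pairwise Euclidean distances between the rows of X.

    Illustrates the Jaakkola et al. (1999) heuristic for choosing
    sigma in a Gaussian RKHS kernel; a sketch, not the authors' code.
    """
    n = X.shape[0]
    dists = [np.linalg.norm(X[i] - X[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.median(dists))

def gaussian_kernel(x, x_prime, sigma):
    """K(x, x') = exp{-||x - x'||^2 / (2 sigma^2)}."""
    return float(np.exp(-np.linalg.norm(x - x_prime) ** 2 / (2 * sigma ** 2)))
```

In the paper's data application, the heuristic would be applied to the pairwise distances between positive- and negative-class predictors rather than all pairs, but the computation is the same in form.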