Sparse-Input Neural Network using Group Concave Regularization

Authors: Bin Luo, Susan Halabi

TMLR 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | These results are supported by extensive simulation studies and real data applications, which demonstrate the finite-sample performance of the estimator in feature selection and prediction across continuous, binary, and time-to-event outcomes.
Researcher Affiliation | Academia | Bin Luo, School of Data Science and Analytics, Kennesaw State University, Marietta, GA 30060, USA; Susan Halabi, Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA.
Pseudocode | No | The paper describes the composite gradient descent algorithm and summarizes the per-epoch calculation steps, but does not present them in a clearly labeled pseudocode or algorithm block.
Open Source Code | Yes | Python code and examples for the proposed group concave regularized neural networks are available at https://github.com/r08in/GCRNN.
Open Datasets | Yes | The data from CALGB 90401 are available from the NCTN Data Archive at https://nctn-data-archive.nci.nih.gov/. The MNIST dataset was downloaded using the built-in torchvision.datasets.MNIST interface in PyTorch, which automatically retrieves the original dataset from Yann LeCun's repository.
Dataset Splits | Yes | In all numerical studies presented in this paper, we adopted a 20% holdout validation set from the training data. ... We randomly split the dataset 100 times into training sets (n=526) and testing sets (n=105) using a 5:1 allocation ratio.
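The quoted protocol (repeated 5:1 train/test splits, with a 20% validation holdout carved from each training portion) can be sketched as follows; the function name and seed handling are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np


def repeated_splits(n, n_repeats=100, seed=0):
    """Yield (train, val, test) index arrays for n samples:
    a ~5:1 train/test allocation, then a 20% validation holdout
    drawn from the training portion. Illustrative sketch only."""
    rng = np.random.RandomState(seed)
    for _ in range(n_repeats):
        idx = rng.permutation(n)
        test, rest = idx[: n // 6], idx[n // 6:]  # ~5:1 allocation
        val_size = int(0.2 * len(rest))           # 20% holdout validation
        val, train = rest[:val_size], rest[val_size:]
        yield train, val, test
```

For n = 631 (matching the paper's CALGB 90401 splits), each repeat yields 105 test samples and a training set of 526, of which 105 serve as the validation holdout.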
Hardware Specification | No | The paper mentions 'powerful computational resources' but does not specify any particular GPU models, CPU types, or other hardware details used for the experiments.
Software Dependencies | No | The paper mentions 'Python code', the 'Adam optimizer', and 'PyTorch' but does not provide specific version numbers for any of these software components or libraries.
Experiment Setup | Yes | For a fair comparison across all neural network methods, we used a ReLU-activated multi-layer perceptron (MLP) with two hidden layers of 10 and 5 units, respectively. ... Specifically, we set the scaling factor γ = 1 in the thresholding operator and used a base learning rate of LR = 0.001 for Adam. ... The parameter search ranges are displayed in Table 3. We set λ = 0 for NN and Oracle-NN to deactivate feature selection. For GLASSONet, GMCPNet, and GSCADNet, the number of epochs at λmin was set to 2000 for the LD and 200 for the HD scenarios. For all other values of λ, the number of epochs was set to 200 for both LD and HD settings. The number of epochs for NN was consistently fixed at 5000. ... The network weights were initialized by sampling from a Gaussian distribution with mean 0 and standard deviation 0.1, while the bias terms were set to 0, following the Xavier initialization technique (Glorot & Bengio, 2010).
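The quoted architecture and initialization (ReLU MLP with hidden layers of 10 and 5 units, weights drawn from N(0, 0.1²), zero biases, Adam at a base learning rate of 0.001) can be sketched in PyTorch; the input dimension and single-output head below are placeholder assumptions.

```python
import torch
import torch.nn as nn


def build_mlp(in_dim, out_dim=1):
    # ReLU MLP with two hidden layers of 10 and 5 units, as quoted above.
    net = nn.Sequential(
        nn.Linear(in_dim, 10), nn.ReLU(),
        nn.Linear(10, 5), nn.ReLU(),
        nn.Linear(5, out_dim),
    )
    # Weights ~ N(0, 0.1^2), biases set to 0, per the described setup.
    for m in net.modules():
        if isinstance(m, nn.Linear):
            nn.init.normal_(m.weight, mean=0.0, std=0.1)
            nn.init.zeros_(m.bias)
    return net


model = build_mlp(in_dim=20)  # in_dim is a placeholder
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # base LR from the paper
```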