Where to Pay Attention in Sparse Training for Feature Selection?

Authors: Ghada Sokar, Zahra Atashgahi, Mykola Pechenizkiy, Decebal Constantin Mocanu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We performed extensive experiments on 10 datasets of different types, including image, speech, text, artificial, and biological. They cover a wide range of characteristics, such as low and high-dimensional feature spaces, and few and large training samples."
Researcher Affiliation | Academia | Ghada Sokar (Eindhoven University of Technology, EMAIL); Zahra Atashgahi (University of Twente, EMAIL); Mykola Pechenizkiy (Eindhoven University of Technology, EMAIL); Decebal Constantin Mocanu (University of Twente / Eindhoven University of Technology, EMAIL)
Pseudocode | Yes | Algorithm 1 (WAST)
Open Source Code | Yes | "Code is available at https://github.com/GhadaSokar/WAST."
Open Datasets | Yes | "We evaluate our method on 10 publicly available datasets, including image, speech, text, time series, biological, and artificial data. They have a variety of characteristics, such as low and high-dimensional features and a small and large number of training samples. Details are in Table 1." (Table 1 lists datasets such as Madelon [24], USPS [32], and MNIST [36], with citations.)
Dataset Splits | No | The paper provides train and test splits in Table 1 (its "Train" and "Test" columns). However, it does not explicitly mention or quantify a separate validation split, distinct from the training and test sets, for hyperparameter tuning or model selection.
Hardware Specification | No | "NN-based and classical methods are trained on Nvidia GPUs and CPUs, respectively."
Software Dependencies | No | "We implemented WAST and QS [4] with PyTorch [58]."
Experiment Setup | Yes | "For all NN-based methods except CAE [5], we use a single hidden layer of 200 neurons. The architecture of CAE consists of two layers. The size of the hidden layers is dependent on the chosen K: [K, (3/2)K]. For WAST and QS, we use a sparsity level of 0.8. Following [4], we report the accuracy of NN-based baselines after 100 epochs unless stated otherwise. ... For WAST, we train the model for 10 epochs. Following [4], we add a Gaussian noise with a factor of 0.2 to the input in WAST and QS [4]. Details of the hyperparameters are in Appendix A.1."