SAND: One-Shot Feature Selection with Additive Noise Distortion

Authors: Pedram Pad, Hadi Hammoud, Mohamad Dia, Nadim Maamari, Liza Andrea Dunbar

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type: Experimental. "We conduct an extensive benchmarking study against state-of-the-art feature selection methods using common datasets and a novel real-world dataset, showcasing our method's effective competition against existing approaches." ... "Test metrics on 9 datasets over 10 trials. The metric is accuracy (↑) for all except MAE (↓) for CA Housing, which is a regression problem."
Researcher Affiliation: Collaboration. "¹CSEM, Neuchâtel, Switzerland; ²EPFL, Lausanne, Switzerland. Correspondence to: Pedram Pad <EMAIL>."
Pseudocode: No. "The paper describes the mathematical model of the SAND layer using equations (1), (2), and (3) and explains its mechanism in paragraph text, but it does not contain a clearly labeled pseudocode block or algorithm."
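The paper's equations (1)-(3) are not reproduced in this report, so the following is purely an illustrative sketch of the additive-noise-distortion idea, not the authors' formulation: each feature is scaled by a learnable gain, and Gaussian noise whose amplitude grows as the gain shrinks is added, so de-selected features end up carrying mostly noise. The function name, the gain normalization, and the noise schedule are all assumptions.

```python
import numpy as np

def sand_like_forward(x, w, sigma=1.5, rng=None):
    """Illustrative additive-noise feature gate (NOT the paper's
    equations (1)-(3)): scale each feature by a normalized gain and
    add Gaussian noise whose amplitude is largest where the gain is
    smallest, drowning out un-selected features during training."""
    rng = rng or np.random.default_rng(0)
    g = np.abs(w) / (np.abs(w).max() + 1e-12)   # gains normalized to [0, 1]
    noise = rng.standard_normal(x.shape)
    return g * x + sigma * (1.0 - g) * noise

# After training, the k features with the largest |w| would be kept.
x = np.ones((4, 6))
w = np.array([2.0, 0.1, 1.5, 0.05, 0.0, 1.0])
y = sand_like_forward(x, w)
```

Note that a feature with maximal gain (g = 1) passes through unperturbed, while a zero-gain feature is replaced entirely by noise, which is the qualitative behavior the paper's mechanism description suggests.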
Open Source Code: Yes. "The code to reproduce our experiments is available at https://github.com/csem/SAND"
Open Datasets: Yes. "Specifically, we utilized nine datasets, seven of which were used in previous studies by (Balın et al., 2019; Lemhadri et al., 2021; Yamada et al., 2020; Yasuda et al., 2023). The additional real and synthetic datasets were California Housing (Torgo, 1997) and HAR70 (Logacjov & Ustad, 2023)." ... "The MSI Grain dataset ... is available in the code repository at https://github.com/csem/SAND."
Dataset Splits: Yes. "Across all experiments, we employed the Adam optimizer with a learning rate of 10⁻³, and we partitioned the datasets into 70-10-20 splits for training, validation, and testing, respectively."
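The reported 70-10-20 partitioning can be sketched as a simple shuffled index split. The function name and the use of numpy here are this report's assumptions; the paper does not specify how the split was implemented.

```python
import numpy as np

def split_70_10_20(n_samples, seed=0):
    """Shuffle sample indices and partition them 70/10/20 into
    train / validation / test subsets, as stated in the paper."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.7 * n_samples)
    n_val = int(0.1 * n_samples)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train_idx, val_idx, test_idx = split_70_10_20(1000)
# 700 / 100 / 200 samples, disjoint and covering the full index range
```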
Hardware Specification: Yes. "Moreover, the experiments were executed on a machine equipped with an NVIDIA GeForce RTX 4090 GPU with 24GB of RAM, paired with an AMD Ryzen 9 5900X 12-Core Processor featuring 24 threads."
Software Dependencies: No. "The paper mentions the 'Adam optimizer' but does not specify version numbers for any software libraries, programming languages, or other dependencies required to replicate the experiment."
Experiment Setup: Yes. "Across all experiments, we employed the Adam optimizer with a learning rate of 10⁻³." ... "For hyperparameters of the SAND layer, we used σ = 1.5 and α = 2 consistently. Unless otherwise specified, we selected k = 60 features for all datasets by default, except for the following: k = 5 for the Madelon dataset, k = 3 for the CA Housing dataset, and k = 6 for the HAR70 dataset." ... "Table 3 containing details about all datasets utilized in the feature selection experiments. Additionally, the table includes the epochs employed during training for each dataset ... and batch size used for training."
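The reported setup values can be collected into a small configuration sketch. The numbers (learning rate 10⁻³, σ = 1.5, α = 2, and the per-dataset k overrides) come directly from the quoted text; the dictionary structure and function name are assumptions for illustration only.

```python
# Reported hyperparameters from the paper's experiment setup.
DEFAULTS = {
    "optimizer": "Adam",
    "learning_rate": 1e-3,
    "sigma": 1.5,   # SAND layer hyperparameter
    "alpha": 2,     # SAND layer hyperparameter
    "k": 60,        # features selected by default
}

# Datasets with a non-default number of selected features.
K_OVERRIDES = {"Madelon": 5, "CA Housing": 3, "HAR70": 6}

def config_for(dataset):
    """Return the experiment configuration for a dataset, applying
    the per-dataset k override where the paper specifies one."""
    cfg = dict(DEFAULTS)
    cfg["k"] = K_OVERRIDES.get(dataset, cfg["k"])
    return cfg
```

Per-dataset epochs and batch sizes are given only in the paper's Table 3, which is not reproduced here, so they are left out of the sketch.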