Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Feature Selection in the Contrastive Analysis Setting

Authors: Ethan Weinberger, Ian Covert, Su-In Lee

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We motivate our approach with a novel information-theoretic analysis of representation learning in the CA setting, and we empirically validate CFS on a semi-synthetic dataset and four real-world biomedical datasets.
Researcher Affiliation | Academia | Ethan Weinberger, Paul G. Allen School of Computer Science, University of Washington, Seattle, WA 98195, EMAIL; Ian C. Covert, Department of Computer Science, Stanford University, Stanford, CA 94305, EMAIL; Su-In Lee, Paul G. Allen School of Computer Science, University of Washington, Seattle, WA 98195, EMAIL
Pseudocode | No | The paper describes the proposed method (CFS) using text descriptions and mathematical equations, accompanied by a diagram (Figure 2), but it does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | An open-source implementation of our method is available at https://github.com/suinleelab/CFS.
Open Datasets | Yes | We validate our approach empirically through extensive experiments on a semi-synthetic dataset introduced in prior work as well as four real-world biomedical datasets... Raw data was downloaded from https://archive.ics.uci.edu/ml/machine-learning-databases/00342/.
Dataset Splits | Yes | For all experiments we divided our data using an 80-20 train-test split, and we report the mean ± standard error over five random seeds for each method.
Hardware Specification | Yes | All experiments were performed on a system running CentOS 7.9.2009 equipped with an NVIDIA RTX 2080 Ti GPU with CUDA 11.7.
Software Dependencies | Yes | CFS models were implemented using PyTorch [50] (version 1.13) with the PyTorch Lightning API... equipped with an NVIDIA RTX 2080 Ti GPU with CUDA 11.7.
Experiment Setup | Yes | For all CFS variants we let our reconstruction function f be a multilayer perceptron with two hidden layers of size 512 with ReLU activation functions... All CFS models were trained using the PyTorch implementation of the Adam [51] optimizer with default hyperparameters. Batch sizes of 128 were used for all experiments.
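The experiment-setup details quoted above (an MLP reconstruction function with two hidden layers of size 512 and ReLU activations, trained with Adam at default hyperparameters and batch size 128) can be sketched in PyTorch. This is a minimal illustration of the stated configuration, not the authors' implementation; the input and output dimensions are hypothetical placeholders, since the per-dataset feature counts are not given here.

```python
import torch
import torch.nn as nn

def make_reconstruction_mlp(in_dim: int, out_dim: int) -> nn.Sequential:
    """Reconstruction function f as described in the setup:
    two hidden layers of size 512 with ReLU activations."""
    return nn.Sequential(
        nn.Linear(in_dim, 512),
        nn.ReLU(),
        nn.Linear(512, 512),
        nn.ReLU(),
        nn.Linear(512, out_dim),
    )

# Dimensions below are placeholders for illustration only.
f = make_reconstruction_mlp(in_dim=100, out_dim=100)

# Adam with PyTorch's default hyperparameters, as stated in the setup.
optimizer = torch.optim.Adam(f.parameters())

# Batch size of 128, as stated in the setup.
batch = torch.zeros(128, 100)
reconstruction = f(batch)
```

The authors' actual training loop (loss function, PyTorch Lightning wrappers, and data handling) lives in the released repository linked above.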