Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Feature Selection in the Contrastive Analysis Setting

Authors: Ethan Weinberger, Ian Covert, Su-In Lee

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We motivate our approach with a novel information-theoretic analysis of representation learning in the CA setting, and we empirically validate CFS on a semi-synthetic dataset and four real-world biomedical datasets.
Researcher Affiliation | Academia | Ethan Weinberger, Paul G. Allen School of Computer Science, University of Washington, Seattle, WA 98195, EMAIL; Ian C. Covert, Department of Computer Science, Stanford University, Stanford, CA 94305, EMAIL; Su-In Lee, Paul G. Allen School of Computer Science, University of Washington, Seattle, WA 98195, EMAIL
Pseudocode | No | The paper describes the proposed method (CFS) using text descriptions and mathematical equations, accompanied by a diagram (Figure 2), but it does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | An open-source implementation of our method is available at https://github.com/suinleelab/CFS.
Open Datasets | Yes | We validate our approach empirically through extensive experiments on a semi-synthetic dataset introduced in prior work as well as four real-world biomedical datasets... Raw data was downloaded from https://archive.ics.uci.edu/ml/machine-learning-databases/00342/.
Dataset Splits | Yes | For all experiments we divided our data using an 80-20 train-test split, and we report the mean ± standard error over five random seeds for each method.
Hardware Specification | Yes | All experiments were performed on a system running CentOS 7.9.2009 equipped with an NVIDIA RTX 2080 Ti GPU with CUDA 11.7.
Software Dependencies | Yes | CFS models were implemented using PyTorch [50] (version 1.13) with the PyTorch Lightning API... equipped with an NVIDIA RTX 2080 Ti GPU with CUDA 11.7.
Experiment Setup | Yes | For all CFS variants we let our reconstruction function f be a multilayer perceptron with two hidden layers of size 512 with ReLU activation functions... All CFS models were trained using the PyTorch implementation of the Adam [51] optimizer with default hyperparameters. Batch sizes of 128 were used for all experiments.
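The experiment-setup details quoted above (an MLP reconstruction function with two hidden layers of size 512 and ReLU activations, trained with Adam at default hyperparameters and batch size 128) can be sketched in PyTorch. This is a minimal illustration of the stated configuration, not the authors' implementation; the input and output dimensions are hypothetical placeholders, since the per-dataset feature counts are not given here.

```python
import torch
import torch.nn as nn

def make_reconstruction_mlp(in_dim: int, out_dim: int) -> nn.Sequential:
    """Reconstruction function f as described in the setup:
    two hidden layers of size 512 with ReLU activations."""
    return nn.Sequential(
        nn.Linear(in_dim, 512),
        nn.ReLU(),
        nn.Linear(512, 512),
        nn.ReLU(),
        nn.Linear(512, out_dim),
    )

# Dimensions below are placeholders for illustration only.
f = make_reconstruction_mlp(in_dim=100, out_dim=100)

# Adam with PyTorch's default hyperparameters, as stated in the setup.
optimizer = torch.optim.Adam(f.parameters())

# Batch size of 128, as stated in the setup.
batch = torch.zeros(128, 100)
reconstruction = f(batch)
```

The authors' actual training loop (loss function, PyTorch Lightning wrappers, and data handling) lives in the released repository linked above.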