Knockout: A simple way to handle missing inputs

Authors: Minh Nguyen, Batuhan K. Karaman, Heejong Kim, Alan Q. Wang, Fengbei Liu, Mert R. Sabuncu

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate Knockout across a wide range of simulations and real-world datasets and show that it offers strong empirical performance.
Researcher Affiliation | Academia | Minh Nguyen (EMAIL), Cornell University; Batuhan K. Karaman (EMAIL), Cornell University; Heejong Kim (EMAIL), Weill Cornell Medicine; Alan Q. Wang (EMAIL), Cornell University; Fengbei Liu (EMAIL), Cornell University; Mert R. Sabuncu (EMAIL), Cornell University and Weill Cornell Medicine; for the Alzheimer's Disease Neuroimaging Initiative
Pseudocode | No | The paper describes the Knockout method and its theoretical justification using mathematical equations and descriptive text, but it does not include any explicitly labeled pseudocode or algorithm blocks with structured steps formatted like code.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide links to any code repositories.
Open Datasets | Yes | Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). We use data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (Mueller et al., 2005) and adopt the state-of-the-art model from (Karaman et al., 2022). We evaluate model performance on CIFAR-10H (Peterson et al., 2019) and CIFAR-10/100N (Wei et al., 2021). In particular, we experiment on a multi-modal tumor segmentation task (Baid et al., 2021). The Auto Arborist dataset (Beery et al., 2022), a multi-view (street and aerial) image dataset, is used for this purpose. The dataset consists of T2-weighted (T2w), diffusion-weighted (DWI) and apparent diffusion coefficient (ADC) MR images per subject (Saha et al., 2022). To further evaluate the generalizability of our approach, we extend our experiments to the UPMC Food-101 dataset (Wang et al., 2015).
Dataset Splits | Yes | In each simulation, we sample 30k data points in total and use 10% for training. These results are averaged over 10 random 80-20 train-test splits. We follow the standard train-test split, reserving 7,000 samples from the training set for validation.
Hardware Specification | No | The paper does not specify any particular GPU models, CPU models, or other hardware components used for running the experiments.
Software Dependencies | No | The paper mentions 'Adam (Kingma & Ba, 2014)' as an optimizer and 'Wide-ResNet-10-28 (Zagoruyko & Komodakis, 2016)', 'ResNet-50 (He et al., 2016)', 'ViT-B-16 (Dosovitskiy et al., 2021)', '3D UNet (Ronneberger et al., 2015)', 'ResNet50 (vision)', and 'RoBERTa-base (text)' as models, but it does not specify any software libraries or packages with their version numbers required for replication.
Experiment Setup | Yes | All methods use the same neural network architecture, a 3-layer multi-layer perceptron (MLP) with hidden layers of width 100 and ReLU activations. Training is done using Adam (Kingma & Ba, 2014) with learning rate 3e-3 for 5k steps. We minimize a sum of cross-entropy loss and Dice loss with equal weighting and use the Adam optimizer with a learning rate of 1e-3. Models are trained using AdamW with a learning rate of 3 × 10^-5, for 8 epochs and a batch size of 72.
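The table notes that the paper ships no pseudocode. The core Knockout idea, randomly replacing input features with a fixed placeholder value during training so that the model learns to predict under arbitrary missingness patterns, can be sketched as below. The function name, the per-feature knockout probability, and the choice of placeholder vector are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def knockout_batch(x, placeholder, p=0.3, rng=None):
    """Randomly replace input features with a placeholder value.

    A hedged sketch of training-time input "knockout": each feature is
    independently replaced by `placeholder` with probability `p`, which
    mimics the missingness the model must handle at inference time.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(x.shape) < p        # True -> feature is knocked out
    return np.where(mask, placeholder, x), mask
```

At inference time, a genuinely missing feature would simply be set to the same placeholder value the model saw during training.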
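The experiment-setup row quotes a 3-layer MLP with hidden width 100 and ReLU activations. A minimal numpy sketch of that architecture follows; the input/output dimensions, the split into two hidden layers plus an output layer, and the He-style initialization are assumptions for illustration only.

```python
import numpy as np

def init_mlp(d_in, d_out, hidden=100, n_hidden=2, rng=None):
    """Initialize weights for an MLP with `n_hidden` hidden layers of
    width `hidden` (He-style init; an illustrative choice)."""
    rng = np.random.default_rng() if rng is None else rng
    sizes = [d_in] + [hidden] * n_hidden + [d_out]
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(x, params):
    """Forward pass: affine layers with ReLU on all but the last."""
    h = x
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)        # ReLU on hidden layers only
    return h
```

In the paper's setup this network would be trained with Adam at learning rate 3e-3 for 5k steps; the sketch above covers only the architecture.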