TabCBM: Concept-based Interpretable Neural Networks for Tabular Data

Authors: Mateo Espinosa Zarlenga, Zohreh Shams, Michael Edward Nelson, Been Kim, Mateja Jamnik

TMLR 2023

Reproducibility checklist. Each entry lists the variable assessed, the result, and the supporting LLM response (quoted from the paper where applicable).
Research Type: Experimental
LLM Response: "We evaluate our method in both synthetic and real-world tabular tasks and show that TabCBM outperforms or performs competitively compared to state-of-the-art methods, while providing a high level of interpretability as measured by its ability to discover known high-level concepts."
Researcher Affiliation: Collaboration
LLM Response: Mateo Espinosa Zarlenga (EMAIL), Department of Computer Science and Technology, University of Cambridge; Zohreh Shams (EMAIL), Department of Computer Science and Technology, University of Cambridge; Michael Edward Nelson (EMAIL), Keyrock, European Bioinformatics Institute, University of Cambridge; Been Kim (EMAIL), Google DeepMind; Mateja Jamnik (EMAIL), Department of Computer Science and Technology, University of Cambridge
Pseudocode: No
LLM Response: The paper describes the model architecture and algorithm steps using mathematical notation and descriptive text, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code: Yes
LLM Response: "All of the code needed to reproduce our results, and use our model through a simple API, has been made public at https://github.com/mateoespinosa/tabcbm via an MIT license."
Open Datasets: Yes
LLM Response: "Datasets. We evaluate our method on both synthetic and real-world tabular datasets. We construct four synthetic tabular datasets of increasing complexity: Synth-Linear, Synth-Nonlin, Synth-Nonlin-Large, and Synth-scRNA. ... Finally, we use three real-world datasets with unknown ground-truth concepts: (1) PBMC (10x Genomics, 2016a;b) as a high-dimensional single-cell transcriptomic dataset, (2) Higgs (Aad et al., 2012) as a large real-world physics tabular dataset..., and (3) FICO (Fair Isaac Corporation, 2019) as a high-stakes financial task..."
Dataset Splits: Yes
LLM Response: "For each method, and across all tasks, we split each task's dataset into 80% training data and 20% test data and generate a validation set by randomly sampling without substitution 20% of the training data."
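The split described above (80% train / 20% test, with a validation set drawn without replacement as 20% of the training data) can be sketched as follows. This is an illustrative reconstruction, not the paper's actual code; the function name, seed, and use of numpy are assumptions.

```python
import numpy as np

def split_dataset(n_samples, seed=0):
    """Sketch of an 80/20 train/test split, plus a validation set
    sampled without replacement as 20% of the training indices.
    Illustrative only; not taken from the TabCBM repository."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)

    # 20% of all samples go to the test set.
    n_test = int(0.2 * n_samples)
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    # Validation: 20% of the *training* data, sampled without replacement.
    val_idx = rng.choice(train_idx, size=int(0.2 * len(train_idx)),
                         replace=False)
    train_idx = np.setdiff1d(train_idx, val_idx)
    return train_idx, val_idx, test_idx
```

With 1,000 samples this yields 640 training, 160 validation, and 200 test indices, all mutually disjoint.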
Hardware Specification: No
LLM Response: "We choose a specific batch size to maximise GPU utilisation while remaining within our hardware's memory capabilities. We used an Adam optimiser (Kingma & Ba, 2014) with learning rate 10^-3, momentum 0.99, and standard hyperparameters β1 = 0.9 and β2 = 0.999, across all methods and tasks. ... With this aim in mind, we fix the architecture used across methods to be the same for a given dataset. We select architectures that are simple to train, yet large and expressive enough to perform well in each task of interest; with the constraint that they should train in our GPU cluster within reasonable times."
Software Dependencies: No
LLM Response: "We built our code base using a combination of TensorFlow (Abadi et al., 2016) and PyTorch (Paszke et al., 2019) and implemented TabCBM in TensorFlow. All of the code needed to reproduce our results, and use our model through a simple API, has been made public at https://github.com/mateoespinosa/tabcbm via an MIT license."
Experiment Setup: Yes
LLM Response: "For each method, and across all tasks, we split each task's dataset into 80% training data and 20% test data and generate a validation set by randomly sampling without substitution 20% of the training data. ... We used an Adam optimiser (Kingma & Ba, 2014) with learning rate 10^-3, momentum 0.99, and standard hyperparameters β1 = 0.9 and β2 = 0.999, across all methods and tasks. ... Hyperparameter values used for each dataset are reported in Table 5."
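The reported optimiser settings (Adam with learning rate 10^-3, β1 = 0.9, β2 = 0.999) correspond to the update rule below, shown here as a framework-free single-step sketch so the hyperparameters are concrete. The function name and the epsilon default are assumptions; the paper's code uses a framework optimiser rather than a hand-rolled one.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8):
    """One Adam update with the hyperparameters reported in the paper
    (lr = 1e-3, beta1 = 0.9, beta2 = 0.999); eps is an assumed default."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```

On the first step (t = 1) the bias corrections cancel the moment decay, so the parameter moves by approximately the full learning rate in the direction opposing the gradient.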