Addressing Attribute Bias with Adversarial Support-Matching

Authors: Thomas Kehrenberg, Myles Bartlett, Viktoriia Sharmanska, Novi Quadrianto

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate the effectiveness of our method on several datasets and realisations of the problem. Code for the paper is publicly available at https://github.com/wearepal/support-matching. We perform experiments on three publicly available image datasets: Coloured MNIST (following a similar data-generation process to Kehrenberg et al., 2020), CelebA (Liu et al., 2015), and NICO++ (Zhang et al., 2023) (following the split-construction procedure of Yang et al., 2023), and one tabular dataset, the American Community Survey (ACS).
Researcher Affiliation Academia Thomas Kehrenberg (EMAIL), Predictive Analytics Lab (PAL), University of Sussex; Myles Bartlett (EMAIL), Predictive Analytics Lab (PAL), University of Sussex; Viktoriia Sharmanska (EMAIL), Predictive Analytics Lab (PAL), University of Sussex; Novi Quadrianto (EMAIL), Predictive Analytics Lab (PAL), University of Sussex, BCAM Severo Ochoa Strategic Lab on Trustworthy Machine Learning, and Monash University, Indonesia
Pseudocode Yes Algorithm 1 Adversarial Support Matching. Input: number of encoder updates N_enc, number of discriminator updates N_disc, encoders f and t, decoder r, discriminator h, training set D_tr, deployment set D_dep. Output: debiaser f with learned invariance to A.
for i = 1 to N_enc do    (encoder update loop)
    Sample batches of perfect bags B_tr ⊂ D_tr and B_dep ⊂ D_dep using Π (Eq. 4)
    Compute L_enc using Eq. 10
    Update f, t, and r by descending in the direction of ∇L_enc
    for j = 1 to N_disc do    (discriminator update loop)
        Sample batches of perfect bags B_tr ⊂ D_tr and B_dep ⊂ D_dep using Π (Eq. 4)
        Compute L_match using Eq. 10
        Update h by ascending in the direction of ∇L_match
    end for
end for
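The alternating update structure of Algorithm 1 can be sketched in plain Python. This is a minimal control-flow sketch only: `sample_bags`, `enc_update`, and `disc_update` are hypothetical stand-ins for the paper's bag sampler Π and the gradient steps on L_enc and L_match; the actual losses and networks are not reproduced here.

```python
def adversarial_support_matching(train_set, deploy_set,
                                 enc_update, disc_update, sample_bags,
                                 n_enc=3, n_disc=2):
    """Sketch of Algorithm 1's alternating optimisation.

    For each of the n_enc encoder steps: sample 'perfect bags' from the
    training and deployment sets, take one descent step on the encoder
    loss, then run n_disc ascent steps on the discriminator's matching
    loss. All callables are hypothetical stand-ins.
    """
    history = []
    for _ in range(n_enc):
        b_tr, b_dep = sample_bags(train_set), sample_bags(deploy_set)
        history.append(("enc", enc_update(b_tr, b_dep)))  # descend on L_enc
        for _ in range(n_disc):
            b_tr, b_dep = sample_bags(train_set), sample_bags(deploy_set)
            history.append(("disc", disc_update(b_tr, b_dep)))  # ascend on L_match
    return history
```

Note that fresh bags are drawn for every update, inner and outer alike, matching the two separate sampling steps in the algorithm listing.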
Open Source Code Yes Code for the paper is publicly available at https://github.com/wearepal/support-matching.
Open Datasets Yes We perform experiments on three publicly available image datasets Coloured MNIST (following a similar data-generation process to Kehrenberg et al., 2020), Celeb A (Liu et al., 2015), and NICO++ (Zhang et al., 2023) (following the split-construction procedure of Yang et al., 2023) and one tabular dataset, American Community Survey (ACS).
Dataset Splits Yes Groups with fewer than 75 samples are designated as the missing groups and removed from the training set; 50 and 25 samples from each group are set aside for the deployment and test sets, respectively, where we have adapted the scheme to our setup by co-opting the original validation set as the deployment set.
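The quoted split construction can be sketched as follows. The function name and sample representation are hypothetical, and one reading is assumed: groups below the size threshold ("missing groups") are excluded from training only, while every group still contributes its deployment and test samples.

```python
from collections import defaultdict

def make_splits(samples, min_group_size=75, n_deploy=50, n_test=25):
    """Sketch of the described split construction (names hypothetical).

    From each group, n_deploy samples go to the deployment set and the
    next n_test to the test set. Groups with fewer than min_group_size
    samples are the 'missing groups': they are removed from the training
    set entirely.
    """
    by_group = defaultdict(list)
    for x, group in samples:
        by_group[group].append(x)

    train, deploy, test = [], [], []
    for xs in by_group.values():
        deploy.extend(xs[:n_deploy])
        test.extend(xs[n_deploy:n_deploy + n_test])
        if len(xs) >= min_group_size:  # missing groups never reach training
            train.extend(xs[n_deploy + n_test:])
    return train, deploy, test
```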
Hardware Specification No The paper discusses the use of a ResNet50 encoder as part of the model architecture (Table 16, Appendix G.4) but does not provide details on the specific hardware (CPU/GPU models, memory) used to run the experiments.
Software Dependencies No We train all models using the AdamW (Loshchilov & Hutter, 2018) optimiser.
Experiment Setup Yes The hyperparameters and architectures for the AutoEncoder (AE), Predictor, and Discriminator subnetworks are detailed in Table 14 for the Coloured MNIST and CelebA datasets, and in Table 15 for the ACS dataset. The hyperparameters associated with our NICO++ experiments are tabulated separately in Table 16, owing to the fixed AE architecture, consisting of a ResNet50 (He et al., 2016) encoder and mirroring decoder, and the lack of clustering. We train all models using the AdamW (Loshchilov & Hutter, 2018) optimiser.
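The AdamW optimiser referenced above (Loshchilov & Hutter, 2018) differs from Adam with L2 regularisation by decoupling weight decay from the gradient-based update. A minimal scalar-parameter sketch of that update rule, with a hypothetical helper name and default hyperparameters, not the paper's configuration:

```python
import math

def adamw_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update for a list of scalar parameters.

    The weight-decay term (weight_decay * p) is applied directly to the
    parameter, outside the adaptive moment estimates -- the decoupling
    that distinguishes AdamW from Adam with L2 regularisation.
    """
    state["t"] += 1
    t = state["t"]
    new_theta = []
    for i, (p, g) in enumerate(zip(theta, grad)):
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1 ** t)  # bias-corrected 1st moment
        v_hat = state["v"][i] / (1 - beta2 ** t)  # bias-corrected 2nd moment
        p = p - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * p)
        new_theta.append(p)
    return new_theta
```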