Counterfactual Fairness on Graphs: Augmentations, Hidden Confounders, and Identifiability
Authors: Hongyi Ling, Zhimeng Jiang, Na Zou, Shuiwang Ji
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the effectiveness of our method in improving the counterfactual fairness of classifiers on various graph tasks. Moreover, theoretical analysis, coupled with empirical results, illustrates the capability of our method to successfully identify hidden confounders. |
| Researcher Affiliation | Academia | Hongyi Ling, Department of Computer Science & Engineering, Texas A&M University; Zhimeng Jiang, Department of Computer Science & Engineering, Texas A&M University; Na Zou, Department of Industrial Engineering, University of Houston; Shuiwang Ji, Department of Computer Science & Engineering, Texas A&M University |
| Pseudocode | No | The paper describes its methodology in Section 3 and provides theoretical proofs in Section 4 and Appendix A, but it does not include any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | We further demonstrate the advance of our method using three real-world datasets (Agarwal et al., 2021), including German, Credit, and Bail. |
| Dataset Splits | Yes | Each dataset is randomly partitioned into training, validation, and test sets, at proportions of 80%, 10%, and 10%, respectively. In the synthetic datasets, we can fully manipulate the data generation process and thus easily generate the counterfactual graphs. For each node, we flip its sensitive attribute to obtain a new sensitive attribute vector S′. Counterfactual graphs are then generated as G(S←S′) = {A(S←S′), X(S←S′), S′}, where A(S←S′) = F_A(Z, S′) and X(S←S′) = F_X(Z, A(S←S′), S′). See Appendix D.1 for more details. For all three datasets, we randomly split 80%/10%/10% for training, validation, and test datasets. |
| Hardware Specification | Yes | We use NVIDIA RTX A6000 GPUs for all our experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer (Kingma & Ba, 2015), GCN (Kipf & Welling, 2017), and Tetrad (Ramsey et al., 2018), but it does not specify version numbers for these or any other software libraries or frameworks. |
| Experiment Setup | Yes | For the classification model, we use a GCN model. The number of GCN layers is two, and we use global mean pooling as the readout function. We set the hidden size to 16. The activation function is ReLU. We use the Adam optimizer (Kingma & Ba, 2015) to train the classification model with a 1 × 10⁻⁴ learning rate and 1 × 10⁻⁴ weight decay. In our experiments on synthetic datasets, we set the dimensionality of the hidden confounders and the number of components to match the data generation process's ground truth. For real-world datasets, we align the dimensionality of the hidden confounders with that of the node features, setting the number of components to eight. |
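The 80%/10%/10% random split quoted in the Dataset Splits row can be sketched as follows. This is an illustrative NumPy sketch, not the authors' code; the function name, seed handling, and interface are assumptions.

```python
import numpy as np

def random_split(num_nodes, seed=0, fractions=(0.8, 0.1, 0.1)):
    """Randomly partition node indices into train/val/test sets.

    `fractions` follows the 80%/10%/10% proportions stated in the paper;
    the seed and function signature are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)           # shuffle all node indices
    n_train = int(fractions[0] * num_nodes)
    n_val = int(fractions[1] * num_nodes)
    train = perm[:n_train]
    val = perm[n_train:n_train + n_val]
    test = perm[n_train + n_val:]               # remainder goes to test
    return train, val, test

train, val, test = random_split(1000)
print(len(train), len(val), len(test))  # 800 100 100
```

The three index arrays are disjoint by construction, since they are contiguous slices of one permutation.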
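The classifier described in the Experiment Setup row (two GCN layers, hidden size 16, ReLU, global mean-pooling readout) can be sketched as a forward pass in plain NumPy. This is a minimal sketch under the standard GCN propagation rule of Kipf & Welling (2017); the random-data setup below is illustrative, and training with Adam (learning rate and weight decay 1 × 10⁻⁴) is not shown.

```python
import numpy as np

def gcn_forward(A, X, W1, W2):
    """Two-layer GCN with ReLU and a global mean-pooling readout.

    A: (n, n) symmetric adjacency without self-loops
    X: (n, d_in) node features; W1: (d_in, 16); W2: (16, n_classes)
    """
    # Symmetrically normalized adjacency with self-loops: D^{-1/2}(A+I)D^{-1/2}
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    H = np.maximum(A_norm @ X @ W1, 0.0)  # layer 1 + ReLU (hidden size 16)
    H = A_norm @ H @ W2                   # layer 2: per-node class scores
    return H.mean(axis=0)                 # global mean-pooling readout

# Illustrative random graph and weights (shapes match the quoted setup).
rng = np.random.default_rng(0)
n, d_in, d_hid, n_cls = 5, 8, 16, 2
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T            # symmetric, no self-loops
X = rng.standard_normal((n, d_in))
W1 = rng.standard_normal((d_in, d_hid)) * 0.1
W2 = rng.standard_normal((d_hid, n_cls)) * 0.1
print(gcn_forward(A, X, W1, W2).shape)  # (2,)
```

A practical implementation would use a GNN library rather than dense matrices, but the dense form makes the normalization and readout explicit.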