CLEAR: Generative Counterfactual Explanations on Graphs
Authors: Jing Ma, Ruocheng Guo, Saumitra Mishra, Aidong Zhang, Jundong Li
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate our framework CLEAR with extensive experiments on both synthetic and real-world graphs. Extensive experiments validate the superiority of CLEAR over the state-of-the-art methods in different aspects. |
| Researcher Affiliation | Collaboration | Jing Ma (University of Virginia, Charlottesville, VA, USA); Ruocheng Guo (Bytedance AI Lab, London, UK); Saumitra Mishra (J.P. Morgan AI Research, London, UK); Aidong Zhang (University of Virginia, Charlottesville, VA, USA); Jundong Li (University of Virginia, Charlottesville, VA, USA) |
| Pseudocode | No | The paper describes the components and mechanism of CLEAR but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code of our proposed framework is included in the supplemental material. It will be released after publication. |
| Open Datasets | Yes | We evaluate our method on three datasets, including a synthetic dataset and two datasets with real-world graphs. (1) Community. This dataset contains synthetic graphs generated by the Erdős–Rényi (E-R) model [24]. (2) Ogbg-molhiv. In this dataset, each graph stands for a molecule... (3) IMDB-M. This dataset contains movie collaboration networks from IMDB. In the ethics checklist, it states: 'The creators developed the adopted datasets based on either publicly available data or data with proper consent.' |
| Dataset Splits | No | The provided text indicates that data splits were specified in Appendix B ('Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Section 4, Appendix B and C.'), but the main body of the paper does not explicitly detail the training/validation/test dataset splits with percentages or sample counts. |
| Hardware Specification | No | The paper states that 'total amount of compute and the type of resources used' are in Appendix B ('Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Appendix B.'). However, no specific hardware details (e.g., GPU models, CPU types) are provided in the main text. |
| Software Dependencies | No | The paper does not provide specific software dependencies (e.g., library names with version numbers like PyTorch 1.9 or Python 3.8) needed to replicate the experiment in the provided text. |
| Experiment Setup | Yes | We set the desired label Y for each graph as its flipped label (e.g., if Y = 0, then Y = 1). For each graph, we generate three counterfactuals for it (N_CF = 3). Other setup details are in Appendix B. We vary α ∈ {0.01, 0.1, 1.0, 5.0, 10.0} and β ∈ {0.01, 0.1, 1.0, 10.0, 100}. |