Discrete Graph Auto-Encoder

Authors: Yoann Boget, Magda Gregorova, Alexandros Kalousis

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through multiple experimental evaluations, we demonstrate the competitive performance of our model in comparison to the existing state of the art across various datasets. Various ablation studies support the interest of our method.
Researcher Affiliation | Academia | Yoann Boget (EMAIL), Geneva School for Business Administration, HES-SO, University of Geneva; Magda Gregorova (EMAIL), Center for Artificial Intelligence and Robotics (CAIRO), Technische Hochschule Würzburg-Schweinfurt (THWS); Alexandros Kalousis (EMAIL), Geneva School for Business Administration, HES-SO
Pseudocode | No | The paper describes methods and processes in detail using natural language and mathematical equations, but it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block, nor does it present structured steps in a code-like format.
Open Source Code | Yes | The source code of our model is publicly available at https://github.com/yoboget/dgae.
Open Datasets | Yes | For simple graphs, we evaluated the performance of our model on two datasets of small graphs, namely Ego-Small and Community-Small, with 200 and 100 graphs, respectively, as well as Enzymes, a dataset of larger graphs. We evaluated the performance of our model on two widely used datasets for molecule generation: Qm9 (Ramakrishnan et al., 2014) and Zinc250k (Irwin et al., 2012).
Dataset Splits | Yes | In particular, we split the datasets between training and test sets with an 80%/20% ratio, as in Jo et al. (2022), and use the exact same split when available.
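The 80%/20% protocol can be sketched as follows; the helper name, shuffling, and seed are illustrative assumptions, not the authors' released code:

```python
import random

def train_test_split(graphs, train_ratio=0.8, seed=0):
    """Shuffle a dataset once and split it into train/test sets.

    Illustrative sketch: the paper reuses an existing fixed split
    (Jo et al., 2022) when one is available rather than re-shuffling.
    """
    idx = list(range(len(graphs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(graphs) * train_ratio)
    train = [graphs[i] for i in idx[:cut]]
    test = [graphs[i] for i in idx[cut:]]
    return train, test

# Community-Small has 100 graphs -> 80 train / 20 test.
train, test = train_test_split(list(range(100)))
```

With a fixed seed the split is reproducible across runs, which is the point of pinning it to a published split.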
Hardware Specification | Yes | We compute the clock time to generate 1000 graphs in the rdkit format on one GeForce RTX 3070 GPU and 12 CPU cores.
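A minimal wall-clock measurement of the kind described could look like the sketch below; `generate` is a hypothetical stand-in for the trained model's sampler, not an API from the paper's codebase:

```python
import time

def time_generation(generate, n_graphs=1000):
    """Measure wall-clock time to generate a batch of graphs (sketch)."""
    start = time.perf_counter()
    graphs = generate(n_graphs)
    elapsed = time.perf_counter() - start
    return graphs, elapsed

# Stand-in generator: swap in the model's sampling routine.
graphs, seconds = time_generation(lambda n: [object() for _ in range(n)])
```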
Software Dependencies | No | The paper mentions software components like 'pytorch-geometric' and 'rdkit' but does not provide specific version numbers for these or any other key software dependencies.
Experiment Setup | Yes | Appendix A.3 reports the details of the models, the hyperparameter choices, and the resulting number of parameters used for the experiments. In our experiments, we use the p-path method, setting p = 3. The codebook size m is a hyper-parameter. We compare ten configurations over various codebook sizes m and numbers of partitions C, representing three sizes of the space of possible codeword sequences M. For Ego-Small and Community-Small, our reported results are the average of fifteen runs, i.e., fifteen generation batches of test-set size evaluated against the test set: three runs from each of five independently trained models. For the Enzymes dataset, which contains much larger graphs, we follow Jo et al. (2022) and average over three runs from a single model.
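The averaging scheme (three generation batches from each of five independently trained models, i.e., fifteen scores per metric) can be sketched as follows; the function names and the dummy metric are illustrative assumptions:

```python
from statistics import mean

def aggregate_runs(models, sample, metric, runs_per_model=3):
    """Average a metric over repeated generation runs per trained model.

    Sketch of the evaluation protocol: each model produces
    `runs_per_model` generation batches, each batch is scored,
    and all scores are averaged.
    """
    scores = [metric(sample(m)) for m in models for _ in range(runs_per_model)]
    return mean(scores), len(scores)

# Five independently trained models x three batches each -> 15 runs.
avg, n_runs = aggregate_runs(models=range(5), sample=lambda m: m, metric=float)
```

For Enzymes, the same helper would be called with a single model and three runs, matching the protocol of Jo et al. (2022).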