NEVAE: A Deep Generative Model for Molecular Graphs

Authors: Bidisha Samanta, Abir De, Gourhari Jana, Vicenç Gómez, Pratim Chattaraj, Niloy Ganguly, Manuel Gomez-Rodriguez

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments reveal that our variational autoencoder can discover plausible, diverse and novel molecules more effectively than several state-of-the-art models."
Researcher Affiliation | Academia | "1 Department of Computer Science and Engineering, IIT Kharagpur, Kharagpur, India; 2 Department of Computer Science and Engineering, IIT Bombay, Mumbai, India; 3 Department of Chemistry, IIT Kharagpur, Kharagpur, India; 4 DTIC, Universitat Pompeu Fabra, Barcelona, Spain; 5 Max Planck Institute for Software Systems, Kaiserslautern, Germany"
Pseudocode | Yes | "Algorithm 1: PropertyOrientedDecoder: it trains a parameterized property-oriented decoder."
Open Source Code | Yes | "To facilitate research in this area, we are releasing an open source implementation of our model in Tensorflow as well as synthetic and real-world data used in our experiments." https://github.com/Networks-Learning/nevae
Open Datasets | Yes | "We experiment with molecules from two publicly available datasets, ZINC (Irwin et al., 2012) and QM9 (Ramakrishnan et al., 2014)."
Dataset Splits | Yes | "More specifically, we first sample 3,000 molecules from our ZINC dataset, which we split into training (90%) and test (10%) sets."
Hardware Specification | Yes | "We carried out all our experiments for NEVAE using Tensorflow 1.4.1, on a 64-bit Debian distribution with a 16-core Intel Xeon CPU (E5-2667 v4 @ 3.20 GHz) and 512 GB RAM."
Software Dependencies | Yes | "We carried out all our experiments for NEVAE using Tensorflow 1.4.1, on a 64-bit Debian distribution with a 16-core Intel Xeon CPU (E5-2667 v4 @ 3.20 GHz) and 512 GB RAM."
Experiment Setup | Yes | "Therein, we had to specify four hyperparameters: (i) D, the dimension of zu; (ii) K, the maximum number of hops used in the encoder to aggregate information; (iii) L, the number of negative samples; (iv) lr, the learning rate. Note that all the parameters W and b in the input, hidden and output layers depend on D and K. We selected these hyperparameters using cross validation. More specifically, we varied lr in a logarithmic scale, i.e., {0.0005, 0.005, 0.05, 0.05}, and the rest of the hyperparameters in an arithmetic scale, and chose the hyperparameters maximizing the value of the objective function in the validation set. For synthetic (real) data, the resulting hyperparameter values were D = 7 (5), K = 3 (5), L = 10 (10) and lr = 0.005 (0.005). To run the baseline algorithms, we followed the instructions in the corresponding repository (or paper). ... We implemented stochastic gradient descent (SGD) using the Adam optimizer."
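The 90%/10% split quoted in the Dataset Splits row can be sketched in a few lines. This is an illustrative helper, not the authors' code; it only assumes the sampled molecules are available as a Python list (here stood in by placeholder strings).

```python
import random

def train_test_split(molecules, train_frac=0.9, seed=0):
    """Shuffle a copy of the molecule list and split it into train/test sets."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    shuffled = list(molecules)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

# Example: 3,000 sampled molecules, as in the ZINC experiment.
molecules = [f"mol_{i}" for i in range(3000)]
train, test = train_test_split(molecules)
print(len(train), len(test))  # 2700 300
```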
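The hyperparameter selection described in the Experiment Setup row (a grid over D, K and L on an arithmetic scale and lr on a logarithmic scale, keeping the setting that maximizes the validation objective) amounts to an exhaustive grid search. A minimal sketch follows; the grid values other than the lr scale and the `validation_score` callback are illustrative assumptions, not the authors' exact search space.

```python
import itertools

def select_hyperparameters(validation_score):
    """Exhaustive grid search; returns the setting with the best validation score."""
    grid = {
        "D": [3, 5, 7, 9],                  # latent dimension of z_u (assumed grid)
        "K": [1, 3, 5, 7],                  # max encoder hops (assumed grid)
        "L": [5, 10, 15],                   # number of negative samples (assumed grid)
        "lr": [0.0005, 0.005, 0.05, 0.5],   # learning rate, logarithmic scale
    }
    best, best_score = None, float("-inf")
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = validation_score(params)    # objective on the validation set
        if score > best_score:
            best, best_score = params, score
    return best

# Toy validation score that peaks at the paper's real-data setting.
target = {"D": 5, "K": 5, "L": 10, "lr": 0.005}
score = lambda p: -sum(abs(p[k] - target[k]) for k in target)
print(select_hyperparameters(score))  # {'D': 5, 'K': 5, 'L': 10, 'lr': 0.005}
```

In practice `validation_score` would train the model with Adam at the given lr and return the variational objective on the held-out set, which is far more expensive than this toy stand-in.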