Discrete Object Generation with Reversible Inductive Construction
Authors: Ari Seff, Wenda Zhou, Farhan Damani, Abigail Doyle, Ryan P. Adams
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed approach on two highly structured discrete domains, molecules and Laman graphs, and find that it compares favorably to alternative methods at capturing distributional statistics for a host of semantically relevant metrics. Quantitative evaluation indicates that the proposed method can effectively model highly structured discrete distributions while adhering to strict validity constraints. |
| Researcher Affiliation | Academia | Ari Seff (Princeton University, Princeton, NJ); Wenda Zhou (Columbia University, New York, NY); Farhan Damani (Princeton University, Princeton, NJ); Abigail Doyle (Princeton University, Princeton, NJ); Ryan P. Adams (Princeton University, Princeton, NJ) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks for its own method. |
| Open Source Code | Yes | We formulate our approach, generative reversible inductive construction (GenRIC), as the equilibrium distribution of a Markov chain that only visits valid objects, without a need for inefficient rejection sampling. Footnote 1: https://github.com/PrincetonLIPS/reversible-inductive-construction |
| Open Datasets | Yes | For molecules we test the proposed approach on the ZINC dataset, which contains about 250K drug-like molecules from the ZINC database [35]. For Laman graphs, we generate synthetic graphs randomly via Algorithm 7 from Moussaoui [29], originally proposed for evaluating geometric constraint solvers embedded within CAD programs. |
| Dataset Splits | Yes | The model is trained on 220K molecules according to the same train/test split as in Jin et al. [19], Kusner et al. [21]. |
| Hardware Specification | No | We acknowledge computing resources from Columbia University's Shared Research Computing Facility project, which is supported by NIH Research Facility Improvement Grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010. This statement describes funding and a facility but lacks specific hardware details (e.g., GPU/CPU models). |
| Software Dependencies | No | The paper mentions software like 'RDKit [24]' but does not provide specific version numbers for software dependencies used in their own experiments. |
| Experiment Setup | No | Unless otherwise stated, the results reported in Sections 3 and 4 use a geometric distribution with five expected steps for the corruption sequence length. For each method, we obtain 20K samples either by running pre-trained models [19, 14, 21], by accessing pre-sampled sets [26, 34, 25], or by training models from scratch [33]. While some details are given, comprehensive hyperparameter values (e.g., learning rate, batch size, specific optimizer settings) are not provided in the main text. |
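
The Experiment Setup row mentions a geometric distribution with five expected steps for the corruption sequence length. The sketch below shows one way such a length could be drawn. It is not taken from the authors' code: the function name `sample_corruption_length` is hypothetical, and it assumes the distribution is supported on {1, 2, ...}, so a mean of 5 corresponds to a success probability of 1/5.

```python
import math
import random


def sample_corruption_length(expected_steps=5.0, rng=random):
    """Draw a length k >= 1 from a geometric distribution with the given mean.

    Assumption (not stated in the source): support starts at 1, so the
    success probability is p = 1 / expected_steps.
    """
    if expected_steps < 1.0:
        raise ValueError("expected_steps must be at least 1")
    p = 1.0 / expected_steps
    if p >= 1.0:
        return 1  # degenerate case: always exactly one step
    # Inverse-CDF sampling: P(K > k) = (1 - p)^k.
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - p)) + 1


if __name__ == "__main__":
    rng = random.Random(0)
    lengths = [sample_corruption_length(5.0, rng) for _ in range(10)]
    print(lengths)
```

With this parameterization most draws are short, while occasional long corruption sequences still occur; the average over many draws approaches the stated five steps.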