A Variational Perspective on Generative Flow Networks

Authors: Heiko Zimmermann, Fredrik Lindsten, Jan-Willem van de Meent, Christian A. Naesseth

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our findings and the performance of the proposed variational objective numerically by comparing it to the trajectory balance objective on two synthetic tasks.
Researcher Affiliation | Academia | Heiko Zimmermann (EMAIL), Amsterdam Machine Learning Lab, University of Amsterdam; Fredrik Lindsten (EMAIL), Division of Statistics and Machine Learning, Linköping University; Jan-Willem van de Meent (EMAIL), Amsterdam Machine Learning Lab, University of Amsterdam; Christian A. Naesseth (EMAIL), Amsterdam Machine Learning Lab, University of Amsterdam
Pseudocode | No | The paper describes methods and mathematical formulations but does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets | Yes | To model a discrete ground-truth distribution π_GT over terminating states we follow Dai et al. (2020); Zhang et al. (2022b) and discretize a continuous distribution π_GT^cont : R^2 → R_+ into 2^16 equally sized grid cells along each dimension. The cells are remapped to Gray code such that neighbouring grid cells differ in exactly one bit, and the resulting pair of 16-bit vectors is concatenated to obtain a single 32-bit vector. We are interested in two settings: (1) learning a forward model Q(τ; φ) such that its marginal distribution Q_T(s_T; φ) approximates a fixed distribution π_T(s_T) = π_GT(s_T) over terminating states, and (2) learning a forward model and energy function ξ : {0, 1}^32 → R jointly such that π_T(s_T; θ) ∝ exp(−ξ(s_T, θ)) ≈ π_GT(s_T).
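The Gray-code discretization described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function names (`to_gray`, `discretize`) and the bin-boundary handling are assumptions; only the overall recipe (2^16 bins per dimension, Gray remapping so neighbouring cells differ in one bit, concatenation into a 32-bit vector) comes from the paper.

```python
import numpy as np

def to_gray(n):
    """Map binary indices to Gray code; adjacent indices differ in one bit."""
    return n ^ (n >> 1)

def discretize(x, lo, hi, bits=16):
    """Bin 2D points into 2**bits cells per dimension and return the
    concatenated Gray-coded bit vectors of length 2 * bits (= 32 here)."""
    n_cells = 2 ** bits
    # Cell index of each coordinate, clipped to the valid range.
    idx = np.clip(((x - lo) / (hi - lo) * n_cells).astype(np.int64),
                  0, n_cells - 1)
    gray = to_gray(idx)  # elementwise bitwise ops work on integer arrays
    # Unpack each Gray-coded index into its bits (MSB first) and concatenate.
    bit_pos = np.arange(bits - 1, -1, -1)
    bits_2d = (gray[..., None] >> bit_pos) & 1      # shape (..., 2, bits)
    return bits_2d.reshape(*x.shape[:-1], 2 * bits)  # shape (..., 2 * bits)
```

With `bits=16` this yields the 32-bit state vectors over which the forward model and energy function operate.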
Dataset Splits | No | The paper refers to 'test data' in Table 2, but does not explicitly specify the methodology or percentages used to split datasets into training, validation, or test sets.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models.
Software Dependencies | No | Footnote 2 states: 'We used jax (Bradbury et al., 2018) to efficiently compute per-sample gradients to estimate the optimal control variates.' While JAX is mentioned as a tool, a specific version number for JAX itself is not provided.
Experiment Setup | Yes | For all experiments we trained the forward model using the Adam optimizer with a learning rate α = 1e-3 and batch size B = 256.
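For reference, a single Adam update with the reported learning rate can be sketched in NumPy as below. The moment-decay rates β1, β2 and ε are the common Adam defaults, which are an assumption here; the paper only states α = 1e-3 and B = 256.

```python
import numpy as np

ALPHA = 1e-3                           # learning rate, as reported
BETA1, BETA2, EPS = 0.9, 0.999, 1e-8   # standard Adam defaults (assumed)

def adam_step(params, grads, m, v, t):
    """One Adam update with bias-corrected first/second moment estimates."""
    m = BETA1 * m + (1 - BETA1) * grads
    v = BETA2 * v + (1 - BETA2) * grads ** 2
    m_hat = m / (1 - BETA1 ** t)       # bias correction, t starts at 1
    v_hat = v / (1 - BETA2 ** t)
    params = params - ALPHA * m_hat / (np.sqrt(v_hat) + EPS)
    return params, m, v
```

In practice the per-sample gradients mentioned in Footnote 2 would be averaged over a batch of B = 256 trajectories before this update.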