Revisiting Non-Acyclic GFlowNets in Discrete Environments
Authors: Nikita Morozov, Ian Maksimov, Daniil Tiapkin, Sergey Samsonov
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In addition, we experimentally re-examine the concept of loss stability in non-acyclic GFlowNet training, as well as validate our own theoretical findings. |
| Researcher Affiliation | Academia | 1HSE University, Moscow, Russia 2CMAP, CNRS, École polytechnique, Institut Polytechnique de Paris, 91128, Palaiseau, France 3Université Paris-Saclay, CNRS, LMO, 91405, Orsay, France. Correspondence to: Nikita Morozov <EMAIL>. |
| Pseudocode | No | The paper describes methods and algorithms in paragraph form and through mathematical equations, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code: github.com/GreatDrake/non-acyclic-gfn. |
| Open Datasets | Yes | We consider two discrete environments for experimental evaluation: 1) a non-acyclic version of the hypergrid environment (Bengio et al., 2021) that was introduced in (Brunswic et al., 2024); 2) the non-acyclic permutation generation environment from (Brunswic et al., 2024) with a harder variant of the reward function. |
| Dataset Splits | No | The paper focuses on generative models that sample objects from a distribution. It evaluates the models based on empirical distributions of generated samples (e.g., 'last 2 * 10^5 samples seen in training' or 'last 10^5 samples seen in training'), rather than using predefined training, validation, and test splits for an input dataset in a discriminative task. |
| Hardware Specification | No | This research was supported in part through computational resources of HPC facilities at HSE University (Kostenetskiy et al., 2021). |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not specify its version or the versions of any other software libraries or programming languages used. |
| Experiment Setup | Yes | We use Adam optimizer with a learning rate of 10^-3 and a batch size of 16 (number of trajectories sampled at each training step). For log Zθ we use a larger learning rate of 10^-2... All models are trained until 2 * 10^6 trajectories are sampled... For SDB we set ε = 1.0 and η = 10^-3. |
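The optimizer setup quoted above (Adam with learning rate 10^-3 for model parameters and a larger 10^-2 for log Zθ) can be sketched as follows. This is an illustrative, framework-free reconstruction of those hyperparameters, not the authors' code; the scalar-parameter Adam class and variable names here are assumptions for demonstration.

```python
# Minimal scalar Adam implementation to illustrate the reported setup:
# one optimizer with lr = 1e-3 for model parameters and a separate one
# with lr = 1e-2 for log Z_theta (hypothetical names for illustration).

class Adam:
    def __init__(self, lr, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, beta1, beta2, eps
        self.m = 0.0  # first-moment (mean) estimate
        self.v = 0.0  # second-moment (uncentered variance) estimate
        self.t = 0    # step counter for bias correction

    def step(self, param, grad):
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad
        self.v = self.b2 * self.v + (1 - self.b2) * grad ** 2
        m_hat = self.m / (1 - self.b1 ** self.t)  # bias-corrected mean
        v_hat = self.v / (1 - self.b2 ** self.t)  # bias-corrected variance
        return param - self.lr * m_hat / (v_hat ** 0.5 + self.eps)

# Two optimizers with the learning rates reported in the paper.
opt_model = Adam(lr=1e-3)  # model parameters
opt_logz = Adam(lr=1e-2)   # log Z_theta gets a 10x larger step

theta, log_z = 0.5, 0.0
theta = opt_model.step(theta, grad=1.0)  # moves by ~lr on the first step
log_z = opt_logz.step(log_z, grad=1.0)
```

On the first Adam step the bias-corrected update is approximately `lr * sign(grad)`, so the two parameters move by about 10^-3 and 10^-2 respectively, matching the reported learning rates.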