Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

When do GFlowNets learn the right distribution?

Authors: Tiago da Silva, Rodrigo Barreto Alves, Eliezer de Souza da Silva, Amauri Souza, Vikas Garg, Samuel Kaski, Diego Mesquita

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Importantly, all sections of this work provide experiments to substantiate our theoretical analyses, illustrating the claims and demonstrating the practical relevance of the methodological contributions.
Researcher Affiliation Collaboration Tiago da Silva, Rodrigo Barreto Alves, Eliezer de Souza da Silva (Getulio Vargas Foundation); Amauri Souza (Federal Institute of Ceará); Vikas Garg (Yai Yai Ltd and Aalto University); Samuel Kaski (Aalto University, Manchester University); Diego Mesquita (Getulio Vargas Foundation)
Pseudocode No The paper describes methods and equations in text format, but there are no explicitly labeled pseudocode or algorithm blocks.
Open Source Code No To avoid implementation bias, we reproduced our experiments using Pan et al. (2023a)'s publicly released code and obtained similar results. (Available online at github.com/ling-pan/FL-GFN.)
Open Datasets Yes We reproduce the experiments in Figure 8 for the task of sampling DNA sequences of length 10 in proportion to a reward function defined by wet-lab measurements of the sequence's binding affinity to a yeast transcription factor (PHO4) (Shen et al., 2023; Jain et al., 2022; Barrera et al., 2016; Trabucco et al., 2022).
Dataset Splits No The paper focuses on generative tasks and sampling from distributions, rather than using predefined datasets with explicit train/test/validation splits. For example, for 'Set generation', it states: 'The support X is defined as the collection of sets with 16 elements sampled from a deposit D = {1, . . . , 32}'. No specific dataset splits are mentioned for any of the tasks.
Hardware Specification Yes Experiments were run in a cluster equipped with A100 GPUs, using a single GPU per run.
Software Dependencies No All experiments relied on Adam (Kingma & Ba, 2014) with a learning rate of 10^-3 for p_F and 10^-2 for log Z for stochastic optimization (Madan et al., 2022).
Experiment Setup Yes All experiments relied on Adam (Kingma & Ba, 2014) with a learning rate of 10^-3 for p_F and 10^-2 for log Z for stochastic optimization (Madan et al., 2022). (...) Hypergrid. (...) we use a batch size of 16 trajectories and train the model for 62500 epochs (10^6 trajectories). We parameterize the forward policy with an MLP composed of two 128-dimensional layers. (...) To parameterize the policies of both the LA- and the standard GFlowNets, we use a 3-layer GIN (Xu et al., 2019) having 32-dimensional layers, followed by an MLP of two 32-dimensional layers.
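The quoted experiment setup can be consolidated into a single configuration sketch. This is purely illustrative: the class and field names below are assumptions, not taken from the paper's code; only the numeric values (learning rates, batch size, epoch count, layer widths) come from the quote above. The sanity check confirms that 16 trajectories per epoch over 62,500 epochs yields the stated 10^6 trajectories.

```python
# Illustrative sketch of the quoted hypergrid training configuration.
# All names (ExperimentConfig and its fields) are hypothetical; only the
# numeric values are taken from the paper's reported setup.
from dataclasses import dataclass


@dataclass
class ExperimentConfig:
    lr_forward_policy: float = 1e-3      # quoted learning rate for p_F (10^-3)
    lr_log_z: float = 1e-2               # quoted learning rate for log Z (10^-2)
    batch_size: int = 16                 # trajectories per batch (hypergrid task)
    epochs: int = 62_500
    mlp_hidden_dims: tuple = (128, 128)  # forward-policy MLP: two 128-dim layers

    @property
    def total_trajectories(self) -> int:
        # 16 trajectories/epoch x 62,500 epochs = 10^6, matching the quote
        return self.batch_size * self.epochs


cfg = ExperimentConfig()
print(cfg.total_trajectories)  # 1000000
```

Note the two distinct learning rates: in TB-style GFlowNet training, the scalar log Z typically gets a larger step size than the forward-policy network, which is consistent with the 10^-2 vs. 10^-3 values quoted here.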