Generalization and Distributed Learning of GFlowNets

Authors: Tiago Silva, Amauri Souza, Omar Rivasplata, Vikas Garg, Samuel Kaski, Diego Mesquita

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our experiments with synthetic and real-world problems demonstrate the benefits of SAL over centralized training in terms of mode coverage and distribution matching.
Researcher Affiliation Collaboration Tiago da Silva (Getulio Vargas Foundation), Amauri Souza (Federal Institute of Ceará), Omar Rivasplata (University of Manchester), Vikas Garg (Yai Yai Ltd and Aalto University), Samuel Kaski (Aalto University and University of Manchester), Diego Mesquita (Getulio Vargas Foundation)
Pseudocode Yes Algorithm 1 Subgraph Asynchronous Learning
Open Source Code No The paper does not explicitly state that code for its methodology is available, nor does it provide a link to a repository. While it notes that another paper's code is public, it does not say the same for its own.
Open Datasets Yes 1. Hypergrid (Bengio et al., 2021; Malkin et al., 2022; 2023; Pan et al., 2023b; Krichel et al., 2024). 2. SIX6 (Jain et al., 2022; Malkin et al., 2022; Shen et al., 2023; Chen & Mauch, 2024; Kim et al., 2024a). ... Barrera et al., 2016; Trabucco et al., 2022. 3. PHO4 (Jain et al., 2022; Malkin et al., 2022; Shen et al., 2023; Chen et al., 2023). ... Barrera et al., 2016; Trabucco et al., 2022.
Dataset Splits Yes For this, we disjointly partition the dataset T_n with n = 3·10^4 into sets T_α and T_{1−α} with α = 0.6.
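The split described above is a standard disjoint partition. A minimal sketch (the function name, seed, and use of a random permutation are assumptions; the paper does not specify how the split was implemented):

```python
import numpy as np

def disjoint_split(dataset, alpha=0.6, seed=0):
    """Disjointly partition a dataset into T_alpha and T_{1-alpha}."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(dataset))
    cut = int(alpha * len(dataset))
    t_alpha = [dataset[i] for i in perm[:cut]]
    t_rest = [dataset[i] for i in perm[cut:]]
    return t_alpha, t_rest

# n = 3·10^4 samples, alpha = 0.6, as reported in the paper
data = list(range(3 * 10**4))
t_a, t_b = disjoint_split(data, alpha=0.6)
```

With these values, T_α holds 18,000 samples and T_{1−α} the remaining 12,000, with no overlap.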
Hardware Specification Yes All experiments were conducted on a single Linux machine with 128 GB of RAM, an NVIDIA RTX 3090 GPU, and a 12th Gen Intel(R) Core(TM) i9-12900K CPU. Unless specified otherwise, the code for reproducing the experiments below was executed on this GPU.
Software Dependencies No The paper mentions optimizers like SGD and Adam, but does not provide specific version numbers for any software libraries, frameworks, or programming languages used in their implementation.
Experiment Setup Yes For the experiments in Figure 1, we considered the set generation task (see Appendix A) with W ∈ {32, 64} elements to choose from and set size S = 6, and the forward policy was parameterized by an MLP with two 64-dimensional layers. The elements' log-utilities u were sampled from [−1, 1] prior to training, and the resulting values were normalized so that the largest reward of a set was 5. For both settings in Figure 1, the models were trained for 1500 epochs with a batch size of 128.
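The reward normalization in the setup above can be illustrated with a short sketch. It assumes a set's log-reward is the sum of its elements' log-utilities (a common choice for GFlowNet set-generation tasks, but not spelled out in the excerpt); variable names and the seed are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
W, S = 32, 6  # number of elements to choose from, set size

# Sample per-element log-utilities uniformly from [-1, 1]
u = rng.uniform(-1.0, 1.0, size=W)

# Assumed: reward of a set = exp(sum of its elements' log-utilities).
# Normalize so the largest achievable set reward equals 5.
max_log_reward = np.sort(u)[-S:].sum()  # sum over the S best elements
scale = 5.0 / np.exp(max_log_reward)

def reward(subset):
    return scale * np.exp(u[list(subset)].sum())

best_set = np.argsort(u)[-S:]  # the maximizing set of size S
```

After normalization, `reward(best_set)` evaluates to 5, matching the stated maximum; every other size-6 set has a strictly smaller reward.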