Flow Q-Learning

Authors: Seohong Park, Qiyang Li, Sergey Levine

ICML 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We experimentally show that FQL leads to strong performance across 73 challenging state- and pixel-based OGBench and D4RL tasks in offline RL and offline-to-online RL." |
| Researcher Affiliation | Academia | "University of California, Berkeley. Correspondence to: Seohong Park <EMAIL>." |
| Pseudocode | Yes | "Algorithm 1 Flow Q-Learning (FQL)" |
| Open Source Code | Yes | "We provide our full implementation and exact commands to reproduce the main results of FQL at https://github.com/seohongpark/fql." |
| Open Datasets | Yes | "We empirically show the effectiveness of FQL on 73 diverse state- and pixel-based tasks across the recently proposed OGBench (Park et al., 2025) and standard D4RL (Fu et al., 2020) benchmarks." |
| Dataset Splits | No | The paper describes using offline datasets for training and evaluating the policy in the environment, but does not specify explicit training/validation/test splits for the datasets themselves. |
| Hardware Specification | Yes | "The run times are measured on the same machine using a single A5000 GPU, and are averaged over 8 seeds." |
| Software Dependencies | No | The paper mentions implementing FQL in JAX and using the Adam optimizer and GELU nonlinearity, but does not provide version numbers for JAX or other key software libraries. |
| Experiment Setup | Yes | "We provide the complete list of hyperparameters in Table 5 and task-specific hyperparameters in Tables 6 and 7." |