The Score-Difference Flow for Implicit Generative Modeling

Authors: Romann M. Weber

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Here we report several experiments on toy data. We focus in this section on particle-optimization applications using the kernel-based definition of SD flow (Section 3.1). This allows us to compare the performance of SD flow against two other kernel-based particle-optimization algorithms, namely MMD gradient flow (Arbel et al., 2019) and Stein variational gradient descent (SVGD) (Liu & Wang, 2016)."
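One of the baselines named above, SVGD (Liu & Wang, 2016), has a compact update rule that can be sketched in a few lines. Below is an illustrative NumPy implementation with an RBF kernel and a user-supplied target score function; the bandwidth and step-size values are assumptions for illustration, not settings from the paper.

```python
import numpy as np

def svgd_step(x, score_fn, eta=0.1, h=1.0):
    """One SVGD update (Liu & Wang, 2016) with an RBF kernel of bandwidth h.

    x: (n, d) array of particles; score_fn(x) returns grad log p(x) per particle.
    """
    n = x.shape[0]
    diffs = x[:, None, :] - x[None, :, :]                 # x_i - x_j, shape (n, n, d)
    k = np.exp(-np.sum(diffs ** 2, axis=-1) / (2.0 * h))  # kernel matrix k(x_j, x_i)
    # phi_i = (1/n) * sum_j [ k(x_j, x_i) * score(x_j) + grad_{x_j} k(x_j, x_i) ]
    phi = (k @ score_fn(x) + (k[:, :, None] * diffs).sum(axis=1) / h) / n
    return x + eta * phi
```

With `score_fn = lambda z: -z` (a standard normal target), repeated application drives a particle cloud toward the origin, while the kernel-gradient term keeps the particles spread out rather than collapsing to the mode.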
Researcher Affiliation | Industry | Romann M. Weber, EMAIL, Disney Research Studios
Pseudocode | Yes (6 algorithms) | "In this section we provide pseudocode algorithms for two main applications of SD flow: (1) direct sampling via particle optimization, which transports a set of particles from a source distribution to match a target distribution, interpolating between the distributions in the process, and (2) the model optimization of a parametric generator by (a) progressively perturbing generator output toward the target distribution and then (b) regressing those perturbed targets on the generator input." The excerpted titles include "Algorithm 1: Particle optimization with SD flow" and "Algorithm 2: Model optimization with SD flow".
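To make application (1) above concrete, here is a minimal NumPy sketch of kernel-based particle optimization in the spirit of Algorithm 1: particles move along the estimated score difference between the target distribution and the current particle distribution. The Gaussian-KDE score estimator and all hyperparameter values are assumptions for illustration; the paper's exact kernel-based estimator may differ.

```python
import numpy as np

def kde_score(x, samples, sigma2):
    # Score (gradient of the log-density) of a Gaussian KDE fit to `samples`,
    # evaluated at each row of `x`.
    diffs = samples[None, :, :] - x[:, None, :]           # (n, m, d)
    logw = -np.sum(diffs ** 2, axis=-1) / (2.0 * sigma2)  # (n, m)
    w = np.exp(logw - logw.max(axis=1, keepdims=True))    # stabilized softmax weights
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum('nm,nmd->nd', w, diffs) / sigma2     # (n, d)

def sd_flow_particle_step(particles, target, sigma2=0.5, eta=0.1):
    # One particle-optimization step along the estimated score difference
    # between the target distribution and the current particle distribution.
    score_target = kde_score(particles, target, sigma2)
    score_current = kde_score(particles, particles, sigma2)
    return particles + eta * (score_target - score_current)
```

Iterating `sd_flow_particle_step` transports the particle cloud toward the target samples, interpolating between the two distributions along the way.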
Open Source Code | No | The paper does not provide any concrete access to source code, such as a repository link, an explicit code-release statement, or code in supplementary materials. It mentions leaving "a thorough exploration of our alternative specifications of SD flow (Section 3.2) on high-dimensional image data to follow-up work".
Open Datasets | No | The paper mentions "toy data", a "25-Gaussian Grid", a "Gaussian Mixture" in the shape of a question mark, and "Swiss roll data". These are either synthetic data generated by the authors or well-known conceptual datasets, but no concrete access information (link, DOI, specific repository, or formal citation with author names and year) is provided for any of them.
Dataset Splits | No | The paper describes experiments on "toy data" and discusses particle optimization and model optimization, but it does not specify any training/validation/test splits of the kind typically found in machine-learning experiments with distinct datasets. The experiments instead evolve a source distribution toward a target distribution rather than splitting a dataset for model evaluation.
Hardware Specification | No | The paper discusses the computational demands of high-dimensional image data but does not provide specific hardware details (such as GPU/CPU models or memory) for its own experiments on toy data. It mentions "our experiments in ℝ³" and "experiments on our kernel-based implementation" but no hardware.
Software Dependencies | No | The paper mentions "AdaGrad (Duchi et al., 2011)" and "standard stochastic gradient descent (SGD)" as optimization algorithms used in the experiments. However, it does not specify any software libraries, frameworks, or version numbers (e.g., Python, PyTorch, TensorFlow, or scikit-learn with specific versions) used to implement these algorithms or the models.
Experiment Setup | Yes | "For the experiments reported in this section, we vary the conditions below, which we describe along with the labels used in Tables 1 and 2. All methods were tested with a default learning rate of η = 0.1. BATCH: The moderate size of our toy data sets allows us to test both batch-based and full-data versions of the algorithms. Batch sizes of 128 were used in the batch condition." For the non-constant condition, a cosine noise schedule (described in the paper) was used, with σ²_max = 10 and σ²_min = 0.5 in one set of experiments and σ²_max = 4 and σ²_min = 0.5 in another. For model optimization: "We performed 1000 steps of SD flow using Algorithm 2 with a constant noise schedule of 10 times the average distance of the initial synthetic (base) distribution to first nearest neighbors in the target distribution (corresponding to σ² > 700), with a batch size of 1024, an SD flow step size of η = 1, and a regression learning rate of λ = 10⁻³."
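The two noise-schedule choices described above can be sketched as follows. The exact cosine form is not given in this excerpt, so the interpolation below is a plausible reading rather than the paper's verbatim definition, and whether the nearest-neighbor quantity is used as a standard deviation or a variance is left ambiguous by the text.

```python
import numpy as np

def cosine_noise_schedule(step, total_steps, sigma2_max, sigma2_min):
    # Cosine interpolation from sigma2_max (at step 0) down to sigma2_min
    # (at the final step); assumed form, not the paper's exact definition.
    t = step / max(total_steps - 1, 1)
    return sigma2_min + 0.5 * (sigma2_max - sigma2_min) * (1.0 + np.cos(np.pi * t))

def constant_noise_from_nn(base, target, factor=10.0):
    # Constant noise level: `factor` times the average distance from each
    # base-distribution sample to its first nearest neighbor in the target.
    d = np.sqrt(((base[:, None, :] - target[None, :, :]) ** 2).sum(axis=-1))
    return factor * d.min(axis=1).mean()
```

With `sigma2_max = 10` and `sigma2_min = 0.5`, the schedule starts at 10 and decays smoothly to 0.5 over the run, matching one of the non-constant conditions quoted above.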