The Score-Difference Flow for Implicit Generative Modeling

Authors: Romann M. Weber

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Here we report several experiments on toy data. We focus in this section on particle-optimization applications using the kernel-based definition of SD flow (Section 3.1). This allows us to compare the performance of SD flow against two other kernel-based particle-optimization algorithms, namely MMD gradient flow (Arbel et al., 2019) and Stein variational gradient descent (SVGD) (Liu & Wang, 2016)."
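One of the baselines named above, SVGD (Liu & Wang, 2016), has a compact update rule that can be sketched in a few lines. Below is an illustrative NumPy implementation with an RBF kernel and a user-supplied target score function; the bandwidth and step-size values are assumptions for illustration, not settings from the paper.

```python
import numpy as np

def svgd_step(x, score_fn, eta=0.1, h=1.0):
    """One SVGD update (Liu & Wang, 2016) with an RBF kernel of bandwidth h.

    x: (n, d) array of particles; score_fn(x) returns grad log p(x) per particle.
    """
    n = x.shape[0]
    diffs = x[:, None, :] - x[None, :, :]                 # x_i - x_j, shape (n, n, d)
    k = np.exp(-np.sum(diffs ** 2, axis=-1) / (2.0 * h))  # kernel matrix k(x_j, x_i)
    # phi_i = (1/n) * sum_j [ k(x_j, x_i) * score(x_j) + grad_{x_j} k(x_j, x_i) ]
    phi = (k @ score_fn(x) + (k[:, :, None] * diffs).sum(axis=1) / h) / n
    return x + eta * phi
```

With `score_fn = lambda z: -z` (a standard normal target), repeated application drives a particle cloud toward the origin, while the kernel-gradient term keeps the particles spread out rather than collapsing to the mode.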
Researcher Affiliation | Industry | Romann M. Weber, EMAIL, Disney Research Studios
Pseudocode | Yes (6 algorithms) | "In this section we provide pseudocode algorithms for two main applications of SD flow: (1) direct sampling via particle optimization, which transports a set of particles from a source distribution to match a target distribution, interpolating between the distributions in the process, and (2) the model optimization of a parametric generator by (a) progressively perturbing generator output toward the target distribution and then (b) regressing those perturbed targets on the generator input." The excerpted titles include "Algorithm 1: Particle optimization with SD flow" and "Algorithm 2: Model optimization with SD flow".
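To make application (1) above concrete, here is a minimal NumPy sketch of kernel-based particle optimization in the spirit of Algorithm 1: particles move along the estimated score difference between the target distribution and the current particle distribution. The Gaussian-KDE score estimator and all hyperparameter values are assumptions for illustration; the paper's exact kernel-based estimator may differ.

```python
import numpy as np

def kde_score(x, samples, sigma2):
    # Score (gradient of the log-density) of a Gaussian KDE fit to `samples`,
    # evaluated at each row of `x`.
    diffs = samples[None, :, :] - x[:, None, :]           # (n, m, d)
    logw = -np.sum(diffs ** 2, axis=-1) / (2.0 * sigma2)  # (n, m)
    w = np.exp(logw - logw.max(axis=1, keepdims=True))    # stabilized softmax weights
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum('nm,nmd->nd', w, diffs) / sigma2     # (n, d)

def sd_flow_particle_step(particles, target, sigma2=0.5, eta=0.1):
    # One particle-optimization step along the estimated score difference
    # between the target distribution and the current particle distribution.
    score_target = kde_score(particles, target, sigma2)
    score_current = kde_score(particles, particles, sigma2)
    return particles + eta * (score_target - score_current)
```

Iterating `sd_flow_particle_step` transports the particle cloud toward the target samples, interpolating between the two distributions along the way.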
Open Source Code | No | The paper does not provide any concrete access to source code, such as a repository link, an explicit code-release statement, or code in supplementary materials. It mentions leaving "a thorough exploration of our alternative specifications of SD flow (Section 3.2) on high-dimensional image data to follow-up work".
Open Datasets | No | The paper mentions "toy data", a "25-Gaussian Grid", a "Gaussian Mixture" in the shape of a question mark, and "Swiss roll data". These are either synthetic data generated by the authors or well-known conceptual datasets, but no concrete access information (link, DOI, specific repository, or formal citation with author names and year) is provided for any of them.
Dataset Splits | No | The paper describes experiments on "toy data" and discusses particle optimization and model optimization, but it does not specify any training/validation/test splits of the kind typically found in machine-learning experiments with distinct datasets. The experiments instead evolve a source distribution toward a target distribution rather than splitting a dataset for model evaluation.
Hardware Specification | No | The paper discusses the computational demands of high-dimensional image data but does not provide specific hardware details (such as GPU/CPU models or memory) for its own experiments on toy data. It mentions "our experiments in ℝ³" and "experiments on our kernel-based implementation" but no hardware.
Software Dependencies | No | The paper mentions "AdaGrad (Duchi et al., 2011)" and "standard stochastic gradient descent (SGD)" as optimization algorithms used in the experiments. However, it does not specify any software libraries, frameworks, or version numbers (e.g., Python, PyTorch, TensorFlow, or scikit-learn with specific versions) used to implement these algorithms or the models.
Experiment Setup | Yes | "For the experiments reported in this section, we vary the conditions below, which we describe along with the labels used in Tables 1 and 2. All methods were tested with a default learning rate of η = 0.1. BATCH: The moderate size of our toy data sets allows us to test both batch-based and full-data versions of the algorithms. Batch sizes of 128 were used in the batch condition." For the non-constant condition, a cosine noise schedule (described in the paper) was used, with σ²_max = 10 and σ²_min = 0.5 in one set of experiments and σ²_max = 4 and σ²_min = 0.5 in another. For model optimization: "We performed 1000 steps of SD flow using Algorithm 2 with a constant noise schedule of 10 times the average distance of the initial synthetic (base) distribution to first nearest neighbors in the target distribution (corresponding to σ² > 700), with a batch size of 1024, an SD flow step size of η = 1, and a regression learning rate of λ = 10⁻³."
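The two noise-schedule choices described above can be sketched as follows. The exact cosine form is not given in this excerpt, so the interpolation below is a plausible reading rather than the paper's verbatim definition, and whether the nearest-neighbor quantity is used as a standard deviation or a variance is left ambiguous by the text.

```python
import numpy as np

def cosine_noise_schedule(step, total_steps, sigma2_max, sigma2_min):
    # Cosine interpolation from sigma2_max (at step 0) down to sigma2_min
    # (at the final step); assumed form, not the paper's exact definition.
    t = step / max(total_steps - 1, 1)
    return sigma2_min + 0.5 * (sigma2_max - sigma2_min) * (1.0 + np.cos(np.pi * t))

def constant_noise_from_nn(base, target, factor=10.0):
    # Constant noise level: `factor` times the average distance from each
    # base-distribution sample to its first nearest neighbor in the target.
    d = np.sqrt(((base[:, None, :] - target[None, :, :]) ** 2).sum(axis=-1))
    return factor * d.min(axis=1).mean()
```

With `sigma2_max = 10` and `sigma2_min = 0.5`, the schedule starts at 10 and decays smoothly to 0.5 over the run, matching one of the non-constant conditions quoted above.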