The Score-Difference Flow for Implicit Generative Modeling
Authors: Romann M. Weber
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here we report several experiments on toy data. We focus in this section on particle-optimization applications using the kernel-based definition of SD flow (Section 3.1). This allows us to compare the performance of SD flow against two other kernel-based particle-optimization algorithms, namely MMD gradient flow (Arbel et al., 2019) and Stein variational gradient descent (SVGD) (Liu & Wang, 2016). |
| Researcher Affiliation | Industry | Romann M. Weber, Disney Research Studios |
| Pseudocode | Yes | 6 Algorithms: In this section we provide pseudocode algorithms for two main applications of SD flow: (1) direct sampling via particle optimization, which transports a set of particles from a source distribution to match a target distribution, interpolating between the distributions in the process, and (2) the model optimization of a parametric generator by (a) progressively perturbing generator output toward the target distribution and then (b) regressing those perturbed targets on the generator input. Algorithm 1: Particle optimization with SD flow; Algorithm 2: Model optimization with SD flow. |
| Open Source Code | No | The paper does not provide any concrete access to source code, such as a repository link, an explicit code release statement, or code in supplementary materials. It mentions leaving "a thorough exploration of our alternative specifications of SD flow (Section 3.2) on high-dimensional image data to follow-up work". |
| Open Datasets | No | The paper mentions "toy data", a "25-Gaussian Grid", a "Gaussian Mixture" in the shape of a question mark, and "Swiss roll data". These are either synthetic data generated by the authors or well-known conceptual datasets, but no concrete access information (link, DOI, specific repository, or formal citation with author names and year) is provided for any of them. |
| Dataset Splits | No | The paper describes experiments on "toy data" and discusses particle optimization and model optimization. However, it does not specify any training/test/validation dataset splits, as would typically be found in machine learning experiments with distinct datasets. The experiments seem to involve evolving a source distribution towards a target distribution, rather than splitting a dataset for model evaluation. |
| Hardware Specification | No | The paper discusses the computational demands for high-dimensional image data but does not provide specific hardware details (like GPU/CPU models, memory) used for running its own experiments on toy data. It mentions "our experiments in R3" and "experiments on our kernel-based implementation" but no hardware. |
| Software Dependencies | No | The paper mentions "AdaGrad (Duchi et al., 2011)" and "standard stochastic gradient descent (SGD)" as optimization algorithms used in experiments. However, it does not specify any software libraries, frameworks, or their version numbers (e.g., Python, PyTorch, TensorFlow, scikit-learn with specific versions) that were used to implement these algorithms or the models. |
| Experiment Setup | Yes | For the experiments reported in this section, we vary the conditions below, which we describe along with the labels used in Tables 1 and 2. All methods were tested with a default learning rate of η = 0.1. BATCH: The moderate size of our toy data sets allows us to test both batch-based and full-data versions of the algorithms. Batch sizes of 128 were used in the batch condition. For one non-constant condition, we used a cosine noise schedule (described above) with σ²_max = 10 and σ²_min = 0.5; for the other non-constant condition, we used a cosine noise schedule with σ²_max = 4 and σ²_min = 0.5. We performed 1000 steps of SD flow using Algorithm 2 with a constant noise schedule of 10 times the average distance of the initial synthetic (base) distribution to first nearest neighbors in the target distribution (corresponding to σ² > 700), with a batch size of 1024, an SD flow step size of η = 1, and a regression learning rate of λ = 10⁻³. |
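The particle-optimization application quoted in the Pseudocode row (Algorithm 1) follows a simple generic structure: repeatedly move the particles along an estimated velocity field until they match the target. The sketch below shows only that outer loop; the `velocity_fn` used here is a hypothetical placeholder (pulling particles toward the target mean), not the paper's kernel-based SD-flow update, which is defined in Section 3.1 of the paper.

```python
import numpy as np

def particle_optimization(x, y, velocity_fn, eta=0.1, n_steps=1000):
    """Generic particle-optimization loop in the style of Algorithm 1.

    x: (n, d) array of source particles, transported toward the target.
    y: (m, d) array of samples from the target distribution.
    velocity_fn: stand-in for the paper's kernel-based SD-flow update.
    eta: step size (the paper's default learning rate is 0.1).
    """
    for _ in range(n_steps):
        x = x + eta * velocity_fn(x, y)  # move particles along the flow
    return x

# Hypothetical velocity field: pull each particle toward the target mean.
# This is NOT the SD-flow update; it only makes the loop runnable.
def toy_velocity(x, y):
    return y.mean(axis=0) - x

rng = np.random.default_rng(0)
src = rng.normal(loc=-5.0, size=(256, 2))  # source particles
tgt = rng.normal(loc=+5.0, size=(256, 2))  # target samples
out = particle_optimization(src, tgt, toy_velocity, eta=0.1, n_steps=100)
```

With this placeholder velocity the particles contract geometrically toward the target mean; the real algorithm instead interpolates between the full source and target distributions.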
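The Experiment Setup row refers to a "cosine noise schedule (described above)" interpolating between σ²_max and σ²_min, but the excerpt does not include its definition. One common way to realize such a schedule, offered here purely as an illustrative assumption that may differ from the paper's exact formula, is a half-cosine interpolation:

```python
import math

def cosine_noise_schedule(step, n_steps, sigma2_max=10.0, sigma2_min=0.5):
    """Half-cosine interpolation from sigma2_max down to sigma2_min.

    Illustrative assumption only: the paper's exact schedule is defined
    elsewhere in its text and may take a different form. Defaults match
    one of the non-constant conditions (sigma2_max = 10, sigma2_min = 0.5).
    """
    t = step / max(n_steps - 1, 1)            # progress in [0, 1]
    w = 0.5 * (1.0 + math.cos(math.pi * t))   # decays from 1 to 0
    return sigma2_min + (sigma2_max - sigma2_min) * w
```

The schedule starts at σ²_max, ends at σ²_min, and decreases smoothly in between, which matches the usual role of an annealed noise level in particle and model optimization.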