On the Convergence of SVGD in KL Divergence via Approximate Gradient Flow
Authors: Masahiro Fujisawa, Futoshi Futami
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we validate our theoretical findings through several numerical experiments. The code to reproduce our experiments is available at https://github.com/msfuji0211/svgd_convergence. |
| Researcher Affiliation | Collaboration | Masahiro Fujisawa (The University of Osaka; RIKEN Center for Advanced Intelligence Project; Lattice Lab, Toyota Motor Corporation); Futoshi Futami (The University of Osaka; RIKEN Center for Advanced Intelligence Project) |
| Pseudocode | No | The paper describes methods and mathematical derivations but does not contain any clearly labeled pseudocode or algorithm blocks. The procedures are explained in paragraph text. |
| Open Source Code | Yes | The code to reproduce our experiments is available at https://github.com/msfuji0211/svgd_convergence. |
| Open Datasets | Yes | Table 1 (Experimental Setup): Dataset = Covertype (UCI Repository); preprocessing = binary classification with pre-scaled features; data size for MCMC = 10,000 samples; splitting = 80% training, 20% testing. |
| Dataset Splits | Yes | Table 1 (Experimental Setup): Dataset = Covertype (UCI Repository); ... splitting = 80% training, 20% testing. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It focuses on the experimental setup in terms of software and parameters. |
| Software Dependencies | No | The paper mentions 'CmdStanPy' in Appendix F.2 but does not specify its version or the versions of any other key software libraries or programming languages used. |
| Experiment Setup | Yes | We set the target distribution as the two-dimensional Gaussian distribution... We adopted the RBF kernel k(x, y) = exp(−(1/h)‖x − y‖²₂)... The bandwidth h was selected by the median trick as in Liu & Wang (2016). To appropriately verify our theoretical analysis, we simply set the decaying step size γ_t = 1/(1 + t)^β (= O(1/t^β)) suggested by Theorem 1 and did not use the Adagrad-based step size... We set the initial step size as γ_0 = 0.01 for all experiments. Table 1 (Experimental Setup): SVGD iterations = 10,000; number of particles N varied across {5, 10, 20, 50}; optimizer = gradient ascent; base step size ε_0 = 1e-2; decay factor d = 1.0; decay exponent β varied across {0.0, 0.5, 0.67, 1.0}; kernel = RBF with median heuristic; particle initialization β ~ N(0, 0.1·I), φ = log τ ~ N(log(0.1), 0.1²); prior hyperparameters (α_0, β_0) = (1.0, 0.01). |
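The setup quoted above (SVGD with an RBF kernel, bandwidth chosen by the median trick, and a decaying step size γ_t = γ_0/(1 + t)^β) can be sketched as follows. This is a minimal illustration, not the authors' released code (see their repository linked above); the exact median-trick form h = med²/log(N + 1), the demo target, and the demo hyperparameters are assumptions for this sketch.

```python
import numpy as np

def svgd_step(X, grad_log_p, gamma_t):
    """One SVGD update with an RBF kernel whose bandwidth is set by the median trick."""
    N = X.shape[0]
    diffs = X[:, None, :] - X[None, :, :]      # (N, N, d): x_i - x_j
    sq = np.sum(diffs ** 2, axis=-1)           # pairwise squared distances
    h = np.median(sq) / np.log(N + 1)          # median trick, as in Liu & Wang (2016)
    K = np.exp(-sq / h)                        # k(x, y) = exp(-||x - y||^2 / h)
    # Repulsive term: sum_j grad_{x_j} k(x_j, x_i) = (2/h) sum_j (x_i - x_j) k(x_i, x_j)
    repulsive = (2.0 / h) * np.sum(diffs * K[:, :, None], axis=1)
    phi = (K @ grad_log_p(X) + repulsive) / N  # driving force + repulsion
    return X + gamma_t * phi

def svgd(X, grad_log_p, n_iter=10_000, gamma0=0.01, beta=0.67):
    """Run SVGD with the decaying step size gamma_t = gamma0 / (1 + t)^beta."""
    for t in range(n_iter):
        X = svgd_step(X, grad_log_p, gamma0 / (1.0 + t) ** beta)
    return X

# Demo: transport particles toward a 2D standard Gaussian, grad log p(x) = -x.
# beta = 0.0 (constant step) here just to make the short demo converge quickly.
rng = np.random.default_rng(0)
particles = svgd(rng.normal(3.0, 1.0, size=(50, 2)), lambda X: -X,
                 n_iter=5000, gamma0=0.1, beta=0.0)
```

The adaptive bandwidth makes each kernel row sum to roughly a constant, so the repulsive term keeps the particles spread out instead of collapsing to the mode.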