Accelerating optimization over the space of probability measures
Authors: Shi Chen, Qin Li, Oliver Tse, Stephen J. Wright
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We complement our findings with numerical examples. In this section, we report on numerical experiments with the Hamiltonian flows introduced above. In Section 6.1 we lay out the algorithms for running (HBF) and (VAF) using representative particles, while in Section 6.2, we showcase the application of the algorithms in two specific examples: potential energy and Bayesian sampling. We consider only continuous-time models in this section, deferring the development of discrete-in-time algorithms to future research. (Section 6: Algorithms and Numerical Experiments) and Figure 1, Figure 2, Figure 3, Figure 4, Figure 5. |
| Researcher Affiliation | Academia | Shi Chen, Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Qin Li, Department of Mathematics, University of Wisconsin-Madison, Madison, WI 53706, USA; Oliver Tse, Department of Mathematics and Computer Science and Eindhoven Hendrik Casimir Institute, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands; Stephen J. Wright, Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA. |
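| Pseudocode | Yes | Heavy-ball flow (HBF): $\dot X_i = V_i$, $\dot V_i = -a V_i - \nabla_x \frac{\delta E}{\delta \rho}[\mu^X_t](X_i)$, $i = 1, \dots, N$. (94) Variational-acceleration-flow (VAF) in its general form: $\dot X_i = V_i$, $\dot V_i = -(\dot\gamma_t - \dot\alpha_t) V_i - e^{2\alpha_t + \beta_t} \nabla_x \frac{\delta E}{\delta \rho}[\mu^X_t](X_i)$, $i = 1, \dots, N$. (95) Nesterov flow as an example of (VAF) using the coefficients (48): $\dot X_i = V_i$, $\dot V_i = -\frac{3}{t} V_i - \frac{9}{t^2} \nabla_x \frac{\delta E}{\delta \rho}[\mu^X_t](X_i)$, $i = 1, \dots, N$. (96) Exponential convergence as an example of (VAF) using coefficients $[\alpha_t, \beta_t, \gamma_t] = [0, t, t]$: $\dot X_i = V_i$, $\dot V_i = -V_i - e^{t} \nabla_x \frac{\delta E}{\delta \rho}[\mu^X_t](X_i)$, $i = 1, \dots, N$. (97) |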
| Open Source Code | No | The numerical integration over time is performed using the Diffrax library (Kidger, 2021). The paper does not state that the authors' own implementation code is available or will be made available. |
| Open Datasets | No | Example 1: Potential Energy. We consider potential energy $E[\rho] = V_\ell[\rho] = \int_{\mathbb{R}^d} V_\ell(x)\, d\rho$, $\ell = 1, 2$, with two different forms of the potential functions: ... For the potential $V_1$, we set spatial dimension to be $d = 500$, with $A \in \mathbb{R}^{500 \times 500}$ a random symmetric positive definite matrix and $b$ is a random vector. ... For potential $V_2$, we take $d = 200$ and choose $M = 1000$ and $h = 20$. Example 2: Bayesian Sampling. Next, we tackle the more challenging task of minimizing KL divergence between $\rho$ and a target distribution, defined by ... We choose two different target measures by specifying the log-density $g$ in the same manner as the potential functions in (100)... Example 3: Neural network training. ... We set the spatial dimension to be $d = 1$ with the target function being $f(x) = \sin(\pi x)$. We choose the data distribution to be the uniform distribution over $[-1, 1]$ and 500 data points are sampled to evaluate the integration in π. The paper describes synthetic data generation for its examples and does not provide specific access information for any publicly available or open datasets. |
| Dataset Splits | No | The paper describes generating data for its examples (e.g., "500 data points are sampled") but does not specify any explicit train/test/validation splits or methodologies for partitioning these generated datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The numerical integration over time is performed using the Diffrax library (Kidger, 2021). We use the Dormand-Prince 5/4 method with the default adaptive step size controller, setting both relative and absolute tolerances to $10^{-6}$. This mentions a library and a method but does not provide specific version numbers for the software dependencies themselves (e.g., "Diffrax 0.X.Y"). |
| Experiment Setup | Yes | The numerical integration over time is performed using the Diffrax library (Kidger, 2021). We use the Dormand-Prince 5/4 method with the default adaptive step size controller, setting both relative and absolute tolerances to $10^{-6}$. The initial conditions of the particles are independently sampled from the standard Gaussian distribution: $X_i(0), V_i(0) \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, I_d)$. For strongly convex functions, the choice $a = 2\sqrt{m}$ used in the analysis is too small to produce optimal computational performance. Thus in all examples, heavy-ball flow is executed with $a = 0.5$. (Example 1) We use $N = 100$ particles for both $V_1$ and $V_2$. (Example 2) In the numerical results below, we use $N = 1600$ particles. We choose $\epsilon = 1$ (deferring the issue of choosing $\epsilon$ more optimally to future work). (Example 3) We set the spatial dimension to be $d = 1$ with the target function being $f(x) = \sin(\pi x)$. We choose the data distribution to be the uniform distribution over $[-1, 1]$ and 500 data points are sampled to evaluate the integration in π. In the numerical results below, we use $N = 200$ particles (neurons). |
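
The particle form of heavy-ball flow (94) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions. The potential is a hypothetical quadratic V(x) = 0.5 |x|^2 rather than the paper's V1 or V2, so the force on each particle is just the gradient of V at that particle. A fixed-step classical RK4 integrator stands in for the Diffrax Dormand-Prince 5/4 solver with adaptive step size reported in the paper. The dimension, time horizon, and step size are illustrative values. The damping a = 0.5, the N = 100 particles, and the standard Gaussian initial conditions follow the reported setup. All function names are hypothetical.

```python
import random


def grad_potential(x):
    # Hypothetical potential V(x) = 0.5 * |x|^2 (an assumption), so grad V(x) = x.
    return list(x)


def hbf_rhs(x, v, a):
    # Heavy-ball flow for one particle: dX/dt = V, dV/dt = -a V - grad V(X).
    g = grad_potential(x)
    return list(v), [-a * vi - gi for vi, gi in zip(v, g)]


def shifted(u, du, s):
    # Return u + s * du componentwise.
    return [ui + s * di for ui, di in zip(u, du)]


def rk4_step(x, v, a, dt):
    # One classical fixed-step RK4 update of the pair (x, v).
    # This replaces the adaptive Dormand-Prince solver used in the paper.
    k1x, k1v = hbf_rhs(x, v, a)
    k2x, k2v = hbf_rhs(shifted(x, k1x, dt / 2), shifted(v, k1v, dt / 2), a)
    k3x, k3v = hbf_rhs(shifted(x, k2x, dt / 2), shifted(v, k2v, dt / 2), a)
    k4x, k4v = hbf_rhs(shifted(x, k3x, dt), shifted(v, k3v, dt), a)
    x_new = [xi + dt / 6 * (p + 2 * q + 2 * r + s)
             for xi, p, q, r, s in zip(x, k1x, k2x, k3x, k4x)]
    v_new = [vi + dt / 6 * (p + 2 * q + 2 * r + s)
             for vi, p, q, r, s in zip(v, k1v, k2v, k3v, k4v)]
    return x_new, v_new


def potential_energy(particles):
    # Empirical energy: average of V(X_i) over the particles.
    return sum(0.5 * sum(c * c for c in x) for x, _ in particles) / len(particles)


def run_hbf(n_particles=100, dim=2, a=0.5, t_end=20.0, dt=0.02, seed=0):
    # Positions and velocities are drawn i.i.d. from a standard Gaussian.
    rng = random.Random(seed)
    particles = [([rng.gauss(0.0, 1.0) for _ in range(dim)],
                  [rng.gauss(0.0, 1.0) for _ in range(dim)])
                 for _ in range(n_particles)]
    e_start = potential_energy(particles)
    for _ in range(round(t_end / dt)):
        particles = [rk4_step(x, v, a, dt) for x, v in particles]
    return e_start, potential_energy(particles)
```

Calling `run_hbf()` returns the empirical energy at t = 0 and at t = `t_end`. For a potential energy the particles do not interact, so each one follows a damped oscillator; an interaction or KL-type energy would instead need the full first variation evaluated on the empirical measure.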