Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Mean-Field Langevin Dynamics: Exponential Convergence and Annealing
Authors: Lénaïc Chizat
TMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we show that our results apply to noisy gradient descent on infinitely wide two-layer neural networks and we provide numerical experiments for G being a kernel Maximum Mean Discrepancy (MMD). We conclude this paper with numerical experiments exploring the behavior of NPGD defined in (3). Figure 1a shows an example of a large-time particle configuration, with the atoms of ν in red and the atoms of µ̂t in black (with t large), with a noise temperature τ = 0.1. Figure 1b shows the evolution of the objective Fτ = G + τH (up to a constant, adjusted for ease of comparison) along the iterations, where the entropy H is estimated using the 1-nearest-neighbor estimator (Kozachenko and Leonenko, 1987; Singh et al., 2003). Finally, Figure 1c shows the advantage of NPGD with simulated annealing vs. PGD to minimize the unregularized function G. |
| Researcher Affiliation | Academia | Lénaïc Chizat lenaic.chizat@epfl.ch EPFL |
| Pseudocode | No | The paper describes algorithms through mathematical equations and definitions, such as Equation (3) for NPGD and Equation (11) for SDE, but does not include a distinct block or section labeled as 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Link to Julia code to reproduce the experiments: https://github.com/lchizat/2022-mean-field-langevin-rate. |
| Open Datasets | No | We take ν as a random empirical distribution of m = 10 samples from the uniform distribution on X. |
| Dataset Splits | No | The paper describes generating a random empirical distribution for ν and running simulations with a fixed number of particles, but does not provide specific training/test/validation splits for any dataset. |
| Hardware Specification | No | The paper does not specify any particular hardware (CPU, GPU, etc.) used for running the numerical experiments. |
| Software Dependencies | No | The paper mentions 'Julia code' in a footnote for reproducing experiments, but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We run NPGD with m = 50 particles, a step-size η = 0.08 and µ0 being the uniform distribution on X. We used a noise temperature that decays polynomially as τt = 20(t + 1)⁻¹ where t is the iteration count. At iteration 800, we stopped the noise in order to observe the quality of the configuration of particles. |
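The quoted setup can be sketched as a short simulation. This is a minimal sketch, not the authors' Julia code: the Gaussian-kernel MMD objective, the unit-torus domain, the two-dimensional toy problem, and all variable names are assumptions made for illustration; only m = 50 particles, the step size η = 0.08, the temperature schedule τt = 20(t + 1)⁻¹, the 10-sample target, and stopping the noise at iteration 800 come from the quoted text.

```python
import numpy as np

def mmd_grad(x, y, sigma=0.5):
    """Gradient of the squared Gaussian-kernel MMD between particles x
    and target samples y, with respect to the particle positions x."""
    def pair_grad(a, b):
        # sum_j grad_{a_i} exp(-||a_i - b_j||^2 / (2 sigma^2)), for each i
        diff = a[:, None, :] - b[None, :, :]                   # (len(a), len(b), d)
        k = np.exp(-(diff ** 2).sum(-1) / (2 * sigma ** 2))    # kernel matrix
        return -(k[..., None] * diff).sum(axis=1) / sigma ** 2
    m, n = len(x), len(y)
    return 2 * pair_grad(x, x) / m ** 2 - 2 * pair_grad(x, y) / (m * n)

rng = np.random.default_rng(0)
nu = rng.uniform(size=(10, 2))      # target: empirical distribution of 10 samples
x = rng.uniform(size=(50, 2))       # m = 50 particles, uniform initialization
eta = 0.08                          # step size from the quoted setup
for t in range(1000):
    tau = 20 / (t + 1) if t < 800 else 0.0       # annealed temperature, switched off at 800
    noise = np.sqrt(2 * eta * tau) * rng.standard_normal(x.shape)
    # noisy (Euler-Maruyama) gradient step, wrapped onto the unit torus
    x = (x - eta * mmd_grad(x, nu) + noise) % 1.0
```

The update x ← x − η∇G(x) + √(2ητ)ξ is the standard Euler-Maruyama discretization of Langevin dynamics; setting τ = 0 after iteration 800 reduces it to plain particle gradient descent on G, mirroring the "stopped the noise" step in the quote.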