Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Mini-batching error and adaptive Langevin dynamics

Authors: Inass Sekkat, Gabriel Stoltz

JMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We numerically verify that the reduction in the bias is proportional to the quality of the approximation of the covariance matrix on a basis of functions (e.g. constant matrices, piecewise constant scalar functions, ...). The paper is organized as follows. We start in Section 2 by reviewing various results related to the numerical analysis of SDEs and the quantification of the bias on the invariant measure sampled by SGLD and Langevin dynamics with stochastic gradient estimators. We next turn to AdL in Section 3: we illustrate that some residual bias remains present due to the fact that the covariance of the gradient estimator is not constant in general, and we quantify it. Section 4 is dedicated to the introduction of an extended version of AdL that allows to accommodate non-constant covariance matrices for the gradient estimator and further reduce the bias. Some conclusions and perspectives are gathered in Section 5.
Researcher Affiliation Academia Sekkat Inass EMAIL CERMICS, Ecole des Ponts, Marne-la-Vallée, France Gabriel Stoltz EMAIL CERMICS, Ecole des Ponts, Marne-la-Vallée, France MATHERIALS team-project, Inria Paris, France
Pseudocode No The paper describes numerical schemes like (18), (30), (58), and (75) in mathematical notation, detailing the iterative updates for variables. While these descriptions provide step-by-step procedures, they are presented as mathematical equations for numerical integrators rather than explicitly labeled pseudocode or algorithm blocks. For example, scheme (58) is introduced as: 'Fixing Γ = γId, the numerical scheme reads as follows: ...' but is not formatted with an 'Algorithm' label or typical pseudocode syntax.
Open Source Code No The authors thank Ben Leimkuhler and Tiffany Vlaar for stimulating discussions on AdL and Matthias Sachs for kindly providing the code used to perform numerical experiments of Section 3.4. I. Sekkat gratefully acknowledges financial support from Université Mohammed VI Polytechnique.
Open Datasets Yes We consider a subset of the MNIST data set containing the digits 7 and 9, and which have been pre-processed by a principal component analysis, as described in Section 4.3 of Leimkuhler et al. (2020).
Dataset Splits No The paper mentions using a subset of the MNIST dataset and a synthetic dataset, but does not provide specific training/test/validation splits. For the synthetic dataset, it states: 'The total number of data points is Ndata = 500 (250 points in each class); see Figure 11.' However, it does not specify how these 500 points are partitioned into training, validation, or test sets for experiments.
Hardware Specification No The paper does not explicitly describe the hardware used to run its experiments. There are no mentions of specific GPU models, CPU models, or other computing resources.
Software Dependencies No The paper does not provide specific version numbers for any software components or libraries used in the experiments. It only mentions that code for numerical experiments in Section 3.4 was provided by Matthias Sachs, but no details on the environment.
Experiment Setup Yes To perform the numerical experiments, we generate a data set of Ndata = 100 according to a Gaussian distribution with mean θ0 = 0 and variance σx = 1. We also set σθ = 1 in the prior distribution. We run the SGLD scheme (18) and Langevin dynamics with the numerical scheme (30) with Γ = 1 for a final time T = 10^6 and various values of Δt (which corresponds to Niter = T/Δt time steps). We also consider various values of n, the subsampling of the data points being done with and without replacement.
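
The setup quoted above can be illustrated with a minimal sketch. It is not the authors' code, and scheme (18) is not reproduced on this page, so the update below assumes the standard SGLD form θ ← θ + Δt·ĝ(θ) + sqrt(2Δt)·G, with ĝ an unbiased mini-batch estimate of the gradient of the log-posterior. The function names (`make_data`, `full_gradient`, `minibatch_gradient`, `sgld`), the seed, the step size and the short run length are all hypothetical choices for illustration; the paper's final time T = 10^6 is far longer.

```python
import math
import random


def make_data(n_data, rng, theta_true=0.0, sigma_x=1.0):
    """Draw n_data points from a Gaussian with mean theta_true (assumed setup)."""
    return [rng.gauss(theta_true, sigma_x) for _ in range(n_data)]


def full_gradient(theta, data, sigma_x=1.0, sigma_theta=1.0):
    """Gradient of the log-posterior: Gaussian prior plus Gaussian likelihood."""
    return -theta / sigma_theta**2 + sum(x - theta for x in data) / sigma_x**2


def minibatch_gradient(theta, data, n, rng, replace,
                       sigma_x=1.0, sigma_theta=1.0):
    """Unbiased estimate of full_gradient from n subsampled data points."""
    if replace:
        batch = rng.choices(data, k=n)   # sampling with replacement
    else:
        batch = rng.sample(data, n)      # sampling without replacement
    scale = len(data) / n                # rescale the batch sum to the full sum
    return (-theta / sigma_theta**2
            + scale * sum(x - theta for x in batch) / sigma_x**2)


def sgld(data, n, dt, n_iter, rng, replace=False, theta0=0.0):
    """Assumed SGLD update: theta += dt * g_hat + sqrt(2 * dt) * G."""
    theta = theta0
    samples = []
    for _ in range(n_iter):
        g_hat = minibatch_gradient(theta, data, n, rng, replace)
        theta += dt * g_hat + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
        samples.append(theta)
    return samples


# Short illustrative run (hypothetical parameters, much shorter than the paper's).
rng = random.Random(0)
data = make_data(100, rng)
samples = sgld(data, n=10, dt=1e-3, n_iter=20000, rng=rng, replace=False)
sample_mean = sum(samples) / len(samples)
# Exact posterior mean for sigma_x = sigma_theta = 1.
posterior_mean = sum(data) / (1.0 + len(data))
print(sample_mean, posterior_mean)
```

Comparing the empirical variance of `samples` across several values of `dt`, `n` and `replace` is one way to observe the mini-batching bias that the quoted experiment measures. In this Gaussian model the exact posterior variance is 1/(1 + Ndata) when σx = σθ = 1, which gives a reference value.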