Optimal Scaling for the Proximal Langevin Algorithm in High Dimensions

Authors: Natesh S. Pillai

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we show that for a wide class of twice differentiable target densities, the proximal MALA enjoys the same optimal scaling as that of MALA in high dimensions and also has an average optimal acceptance probability of 0.574. The results of our paper thus give the following practically useful guideline: for smooth target densities where the gradient is expensive to compute or numerically unstable while implementing MALA, users may replace the gradient with the corresponding proximal function (which can often be computed relatively cheaply via convex optimization) without losing any efficiency gains from optimal scaling. We show this for two classes of examples. First, for the product of Gaussians, we identify the optimal scale for proximal MALA and show that it is identical to MALA. Next, following the exact framework used in Pillai et al. (2012), we define a version of the proximal MALA algorithm in a Hilbert space. We show that for a certain class of twice differentiable, infinite-dimensional non-product measures commonly used in applications, the proximal MALA applied to an N-dimensional approximation of the target will also take O(N^{1/3}) steps to explore the invariant measure, with an optimal acceptance probability of 0.574.
Researcher Affiliation | Academia | Natesh S. Pillai, Department of Statistics, Harvard University, MA 02138, USA
Pseudocode | No | The paper describes algorithms using mathematical equations and descriptive text, but no explicit pseudocode or algorithm block is provided.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to any code repositories or supplementary materials containing code.
Open Datasets | No | The paper uses theoretical constructs such as 'product of standard Gaussians' and 'infinite dimensional non-product measures' for its analysis. While a 'Poisson regression model' is mentioned for an illustrative example in Figure 1, no concrete dataset is provided with access information (link, DOI, citation, or repository) for empirical validation.
Dataset Splits | No | The paper does not conduct experiments on specific datasets that would require train/test/validation splits, so no dataset split information is provided.
Hardware Specification | No | The paper focuses on theoretical analysis and mathematical proofs; it does not describe any experimental setup or computations requiring specific hardware specifications.
Software Dependencies | No | The paper mentions that MATHEMATICA was used for a derivation step ('We used MATHEMATICA for obtaining this expansion'), but it does not specify any software dependencies (e.g., programming languages, libraries, or solvers with version numbers) that would be required to implement or reproduce the algorithms and results.
Experiment Setup | No | The paper is theoretical in nature, presenting mathematical analyses and proofs. It does not describe an experimental setup with specific hyperparameters, training configurations, or system-level settings typically found in empirical studies.
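The guideline quoted in the abstract (replace the gradient in MALA with a proximal surrogate, scale the step size like N^{-1/3}, and tune toward an acceptance rate near 0.574) can be sketched in a few lines. The example below is a minimal illustration, not the paper's implementation: it assumes a product-of-standard-Gaussians target, for which the proximal map of lambda*U has the closed form x/(1+lambda), and the step-size constant 1.6 and lambda = 0.01 are illustrative choices rather than values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: product of d standard Gaussians, so U(x) = -log pi(x) = ||x||^2 / 2
# (up to a constant). For this U, prox_{lam U}(x) = x / (1 + lam) in closed form.
def U(x):
    return 0.5 * np.dot(x, x)

def prox_U(x, lam):
    return x / (1.0 + lam)

def proximal_mala_step(x, h, lam):
    # Gradient surrogate from the Moreau-Yosida envelope:
    # grad U(x) is approximated by (x - prox_{lam U}(x)) / lam.
    g = (x - prox_U(x, lam)) / lam
    prop = x - 0.5 * h * g + np.sqrt(h) * rng.standard_normal(x.shape)
    gp = (prop - prox_U(prop, lam)) / lam

    # Gaussian proposal log-density (up to a constant that cancels in the ratio).
    def log_q(to, frm, grad_frm):
        mu = frm - 0.5 * h * grad_frm
        return -np.dot(to - mu, to - mu) / (2.0 * h)

    # Metropolis-Hastings accept/reject step.
    log_alpha = (U(x) - U(prop)
                 + log_q(x, prop, gp) - log_q(prop, x, g))
    if np.log(rng.uniform()) < log_alpha:
        return prop, True
    return x, False

d = 100
h = 1.6 * d ** (-1.0 / 3.0)   # step size scaled as N^{-1/3}, as in the paper
x = rng.standard_normal(d)

accepts = 0
n_steps = 2000
for _ in range(n_steps):
    x, acc = proximal_mala_step(x, h, lam=0.01)
    accepts += acc

print(f"acceptance rate: {accepts / n_steps:.2f}")
```

In practice the constant multiplying N^{-1/3} would be tuned so that the empirical acceptance rate settles near 0.574; as lambda shrinks, the proximal surrogate (x - prox(x))/lam approaches the true gradient and the sketch reduces to standard MALA.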