Mean-Square Analysis of Discretized Itô Diffusions for Heavy-tailed Sampling
Authors: Ye He, Tyler Farghly, Krishnakumar Balasubramanian, Murat A. Erdogdu
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We analyze the complexity of sampling from a class of heavy-tailed distributions by discretizing a natural class of Itô diffusions associated with weighted Poincaré inequalities. Based on a mean-square analysis, we establish the iteration complexity for obtaining a sample whose distribution is ϵ-close to the target distribution in the Wasserstein-2 metric. ... We also provide similar iteration complexity results for the case where only function evaluations of the unnormalized target density are available, by estimating the gradients using a Gaussian smoothing technique. We provide illustrative examples based on the multivariate t-distribution. Section 7. Numerical Experiments: We now provide simulation results comparing the performance of the ULA and the Itô discretization in (10). We use the Python Optimal Transport package (Flamary et al., 2021) for the experiments. In our first set of experiments, we use the algorithms to sample from multivariate Student-t targets with 4 degrees of freedom and the scaling matrix being the identity matrix. In Figure 2, we examine the cases of dimension being 10, 25 and 50, respectively, from left to right. |
| Researcher Affiliation | Academia | Ye He EMAIL Department of Mathematics Georgia Institute of Technology Atlanta, GA 30332, USA Tyler Farghly EMAIL Department of Statistics University of Oxford Oxford, OX1 2JD, United Kingdom Krishnakumar Balasubramanian EMAIL Department of Statistics University of California Davis, CA 95616, USA Murat A. Erdogdu EMAIL Department of Computer Science & Department of Statistics University of Toronto Toronto, ON M5S 3G4, Canada |
| Pseudocode | No | The paper defines the Euler-Maruyama discretization in equation (10) as: x_{k+1} = x_k − h(β−1)∇V(x_k) + √(2h V(x_k)) ξ_{k+1}. It also defines the zeroth-order algorithm in equation (22). However, these are mathematical equations defining the algorithms, not structured pseudocode blocks or clearly labeled algorithm sections. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the described methodology, nor does it provide a direct link to a code repository. It mentions using 'the Python Optimal Transport package (Flamary et al., 2021)' but this is a third-party tool, not the authors' own code. |
| Open Datasets | No | The paper uses examples based on sampling from known theoretical distributions, specifically the 'multivariate t-distribution'. It does not refer to or use an empirical 'dataset' that would require public availability information in the context of this question. |
| Dataset Splits | No | The paper samples from theoretical distributions (e.g., multivariate t-distribution) rather than using empirical datasets. Therefore, the concept of training, test, or validation dataset splits is not applicable to the methodology described. |
| Hardware Specification | No | The paper describes experimental procedures such as running algorithms for a certain number of iterations with specified step-sizes, and the dimensions of the targets. However, it does not provide any specific details about the hardware used, such as GPU or CPU models, memory, or specific cloud computing instances. |
| Software Dependencies | No | The paper mentions using 'the Python Optimal Transport package (Flamary et al., 2021)' for numerical experiments. However, it does not specify the version number of this package or any other software dependencies, which is required for a reproducible description. |
| Experiment Setup | Yes | For the case of 10 and 25 dimensions, both algorithms are run for 100000 iterations in parallel 100 times, with step-size set to 0.0001. For the case of 50 dimensions, the algorithm is run for 50000 iterations in parallel 100 times, with step-size set to 0.0002. ... Both algorithms are run for 10000 iterations in parallel 500 times, with step-size set to 0.001. The first, second and third rows use different initializations: (10, 10), (−16, 1) and (6, 6), respectively. |
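
The Euler-Maruyama discretization quoted in the Pseudocode row can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a multivariate Student-t target with ν degrees of freedom and identity scaling, written as π ∝ V(x)^{−β} with V(x) = 1 + ‖x‖²/ν and β = (ν + d)/2, which matches the paper's illustrative example; the function name and its interface are our own.

```python
import numpy as np

def ito_discretization(x0, beta, nu, h, n_steps, rng):
    """Euler-Maruyama discretization of the Ito diffusion (eq. (10)):
        x_{k+1} = x_k - h*(beta - 1)*grad V(x_k) + sqrt(2*h*V(x_k)) * xi_{k+1},
    for the Student-t potential V(x) = 1 + ||x||^2 / nu (identity scaling)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        V = 1.0 + x @ x / nu          # potential value at the current iterate
        grad_V = 2.0 * x / nu         # gradient of V for this target
        xi = rng.standard_normal(x.shape)  # fresh standard Gaussian noise
        x = x - h * (beta - 1.0) * grad_V + np.sqrt(2.0 * h * V) * xi
    return x

# Example: dimension d = 10, nu = 4 degrees of freedom, as in the experiments.
rng = np.random.default_rng(0)
d, nu = 10, 4.0
sample = ito_discretization(np.full(d, 10.0), beta=(nu + d) / 2.0,
                            nu=nu, h=1e-4, n_steps=1000, rng=rng)
```

Note the multiplicative noise term √(2h V(x_k)): unlike ULA, whose diffusion coefficient is constant, the noise here scales with the potential, which is what allows the chain to reach the heavy tails of the target. A Wasserstein-2 comparison against the target, as in the paper's Figure 2, could then be computed with the Python Optimal Transport package.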