Frequentist Guarantees of Distributed (Non)-Bayesian Inference
Authors: Bohan Wu, César A. Uribe
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We establish frequentist properties, i.e., posterior consistency, asymptotic normality, and posterior contraction rates, for the distributed (non-)Bayesian inference problem for a set of agents connected over a network. These results are motivated by the need to analyze large, decentralized datasets, where distributed (non-)Bayesian inference has become a critical research area across multiple fields, including statistics, machine learning, and economics. Our results show that, under appropriate assumptions on the communication graph, distributed (non-)Bayesian inference retains parametric efficiency while enhancing robustness in uncertainty quantification. We also explore the trade-off between statistical efficiency and communication efficiency by examining how the design and size of the communication graph impact the posterior contraction rate. Furthermore, we extend our analysis to time-varying graphs and apply our results to exponential family models, distributed logistic regression, and decentralized detection models. ... Our work fills a crucial gap by establishing the frequentist properties of such distributed Bayesian procedures, including posterior consistency, asymptotic normality, and posterior contraction rates. |
| Researcher Affiliation | Academia | Bohan Wu (EMAIL), Department of Statistics, Columbia University, New York, NY 10027, USA; César A. Uribe (EMAIL), Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005, USA |
| Pseudocode | No | The paper describes mathematical update rules, such as Equation (2.9) for the mirror descent update, and refers to methods like the 'Langevin Monte Carlo (LMC) algorithm' for simulations. However, it does not provide any explicitly labeled pseudocode blocks or algorithm listings with structured steps. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide links to code repositories. While simulations are mentioned in Appendix B, no code access is provided. |
| Open Datasets | No | The paper focuses on theoretical analysis and uses simulated data for illustrative examples, such as those described in Sections 8.2 ('Distributed Logistic Regression', e.g., 'we simulate a static connected graph with 10 nodes, generating 500 data points under a logistic regression model') and 8.3 ('Distributed Detection', e.g., 'This target generates data $X_t^1, \ldots, X_t^m \sim \delta_{\theta_0}$ at time $t$'). No specific public or open datasets are mentioned with access information. |
| Dataset Splits | No | The paper primarily focuses on theoretical developments and uses simulated data for illustrations. It does not mention any training, test, or validation dataset splits, as it does not conduct empirical experiments on existing datasets. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, processor types, or memory) used for running simulations or any other computational work described. |
| Software Dependencies | No | The paper mentions methods like the 'Langevin Monte Carlo (LMC) algorithm' and 'gradient-based optimization methods'. However, it does not specify any particular software libraries, programming languages, or their version numbers used for implementation. |
| Experiment Setup | No | The paper mentions generating 1,000 samples for histograms in Appendix B and using 'Newton-Raphson algorithm' for calculations in Section 8.3. However, it does not provide specific hyperparameters (e.g., learning rates, batch sizes, number of iterations, optimization algorithm configurations, or LMC parameters) or detailed training configurations for any computational aspect. |
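Since the paper releases no code, the simulated setting it describes (a static connected graph with 10 nodes and 500 data points under a logistic regression model, updated via a geometric-averaging non-Bayesian rule) can only be approximated. The sketch below is a minimal, hypothetical reconstruction, not the authors' implementation: it assumes a ring graph with lazy Metropolis weights, a discrete parameter grid in place of the paper's continuous posterior (and of its LMC sampler), and a standard distributed non-Bayesian update in which each agent geometrically averages neighbors' beliefs and then applies a local likelihood update. All variable names and constants are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup echoing the paper's illustration: 10 agents,
# 500 total observations (50 rounds x 10 agents) from a logistic model.
m, T = 10, 50
theta_true = 1.5                   # illustrative true slope, not from the paper
grid = np.linspace(-3, 3, 61)      # discrete parameter grid for illustration

# Doubly stochastic mixing matrix for a ring graph (lazy Metropolis weights).
A = np.zeros((m, m))
for i in range(m):
    A[i, (i - 1) % m] = A[i, (i + 1) % m] = 1 / 3
    A[i, i] = 1 / 3

def loglik(x, y, theta):
    """Log-likelihood of one (x, y) pair under a logistic model with slope theta."""
    p = 1.0 / (1.0 + np.exp(-theta * x))
    return y * np.log(p) + (1 - y) * np.log(1 - p)

# Uniform log-beliefs over the grid, one row per agent.
logmu = np.zeros((m, grid.size))

for t in range(T):
    # Each agent draws one fresh observation from the true model.
    x = rng.normal(size=m)
    y = (rng.random(m) < 1.0 / (1.0 + np.exp(-theta_true * x))).astype(float)
    # Non-Bayesian update: geometric averaging of neighbors' beliefs
    # (A @ logmu), followed by a local Bayesian likelihood update.
    logmu = A @ logmu + np.array([loglik(x[i], y[i], grid) for i in range(m)])
    logmu -= logmu.max(axis=1, keepdims=True)   # normalize for stability

beliefs = np.exp(logmu)
beliefs /= beliefs.sum(axis=1, keepdims=True)
estimates = grid[beliefs.argmax(axis=1)]
print(estimates)  # each agent's posterior mode should land near theta_true
```

With a connected, doubly stochastic mixing matrix, every agent's belief aggregates information from the whole network, which is the mechanism behind the posterior consistency and contraction results the paper establishes; the grid-based belief here stands in for the continuous posterior that the paper approximates with Langevin Monte Carlo.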