An asymptotic analysis of distributed nonparametric methods
Authors: Botond Szabó, Harry van Zanten
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To see that interesting things can happen it is exemplifying to compare the results of a distributed and a non-distributed (Bayesian) analysis of simulated data. Concretely, we consider a true signal θ consisting of the Fourier coefficients of the function shown in Figure 1. For this signal we simulate data according to (2.2), with σ = 1, m = 40 and n = 120 × 40 = 4800. Then for every local observer a Bayesian procedure is carried out with a Gaussian prior on θ, postulating that the coordinates θ_i are independent and N(0, i^(−1−2α))-distributed. The hyperparameter α, which describes the regularity of the prior, is determined using a distributed version of maximum marginal likelihood, as described in Section 4. This analysis leads to m = 40 local posterior distributions. These are then combined to produce an overall posterior distribution for the signal. The precise procedure is described in Section 4. The resulting estimator for the signal, together with pointwise 95% credible intervals, is shown in the left plot in Figure 2. The corresponding non-distributed result is obtained by first aggregating all local data as in (2.1) and then carrying out the same Bayesian procedure on these complete data. The resulting non-distributed reconstruction of the signal is shown on the right in Figure 2. ... Simulations further illustrate the theoretical results. |
| Researcher Affiliation | Academia | Botond Szabó EMAIL Mathematical Institute Leiden University 2333 CA Leiden The Netherlands Harry van Zanten EMAIL Korteweg-de Vries Institute for Mathematics University of Amsterdam Science Park 105-107 1098 XG Amsterdam The Netherlands |
| Pseudocode | No | The paper describes various statistical methods and their mathematical properties but does not contain explicit pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not provide any explicit statements about the availability of source code, nor does it include links to repositories or supplementary materials for code. |
| Open Datasets | No | The paper analyzes a 'distributed version of the classical signal-in-Gaussian-white-noise model' and states 'For this signal we simulate data according to (2.2)'. The data used in the paper's examples are simulated, not derived from a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper describes theoretical analysis and simulations based on a signal-in-Gaussian-white-noise model. It does not mention training, validation, or test dataset splits, as it does not involve machine learning model training on pre-existing datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to conduct its simulations or analyses, such as CPU/GPU models, memory, or specific computing environments. |
| Software Dependencies | No | The paper focuses on mathematical analysis and theoretical performance of statistical methods. It does not mention any specific software, libraries, or frameworks with version numbers that would be required to reproduce the work. |
| Experiment Setup | Yes | To see that interesting things can happen it is exemplifying to compare the results of a distributed and a non-distributed (Bayesian) analysis of simulated data. Concretely, we consider a true signal θ consisting of the Fourier coefficients of the function shown in Figure 1. For this signal we simulate data according to (2.2), with σ = 1, m = 40 and n = 120 × 40 = 4800. Then for every local observer a Bayesian procedure is carried out with a Gaussian prior on θ, postulating that the coordinates θ_i are independent and N(0, i^(−1−2α))-distributed. The hyperparameter α, which describes the regularity of the prior, is determined using a distributed version of maximum marginal likelihood, as described in Section 4. ... Simulations further illustrate the theoretical results. We have considered a true signal θ consisting of the Fourier coefficients of the function shown in the left panel of Figure 3. This is a signal which has regularity β = 1 in the sense of (3.3). For this signal we simulated data according to (2.2), with σ = 1, n = 4800 and m = 40, i.e. we considered a distributed setting with m = 40 machines. |
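
The experiment setup quoted in the table can be illustrated with a minimal Python sketch. Several pieces are assumptions rather than details confirmed by the text above: equation (2.2) is not reproduced here, so the local model is assumed to be Y[j, i] = θ_i + σ·sqrt(m/n)·Z[j, i] with standard normal noise; the true signal `theta` is a hypothetical stand-in for the function in the paper's figures; α is fixed by hand instead of being chosen by the distributed marginal likelihood procedure of Section 4; and the local posterior means are combined by plain averaging, which is not necessarily the paper's aggregation method. Only the prior N(0, i^(−1−2α)) and the values σ = 1, m = 40, n = 4800 come from the quoted text.

```python
import numpy as np


def simulate_local_data(theta, n, m, sigma=1.0, rng=None):
    """Draw m local observations of theta.

    Assumed model (not quoted in the source):
    Y[j, i] = theta[i] + sigma * sqrt(m / n) * Z[j, i], with Z standard normal.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal((m, theta.size))
    return theta + sigma * np.sqrt(m / n) * noise


def posterior_mean(y, noise_var, alpha):
    """Conjugate Gaussian posterior mean, coordinate by coordinate.

    Prior: theta_i ~ N(0, i^(-1 - 2 * alpha)), independent across i.
    """
    i = np.arange(1, y.shape[-1] + 1, dtype=float)
    prior_var = i ** (-1.0 - 2.0 * alpha)
    return prior_var / (prior_var + noise_var) * y


# Values taken from the quoted setup.
n, m, sigma = 4800, 40, 1.0
# Assumptions: alpha fixed by hand, truncation at 500 coefficients,
# and a hypothetical true signal.
alpha = 1.0
idx = np.arange(1, 501, dtype=float)
theta = idx ** -1.5 * np.sin(idx)

rng = np.random.default_rng(0)
local = simulate_local_data(theta, n, m, sigma, rng)

# Distributed analysis: one posterior mean per machine, then naive averaging
# (a stand-in for the paper's Section 4 procedure).
local_means = posterior_mean(local, sigma**2 * m / n, alpha)
averaged = local_means.mean(axis=0)

# Non-distributed analysis: pool the data first (noise variance sigma^2 / n
# under the assumed model), then apply the same prior once.
pooled = local.mean(axis=0)
pooled_mean = posterior_mean(pooled, sigma**2 / n, alpha)
```

Under these assumptions both estimators are coordinate-wise shrinkages of the pooled data, and the naively averaged one shrinks at least as hard because each machine sees the larger noise variance m/n. That gives a rough sense of why the way local posteriors are combined matters in the comparison the paper describes.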