Greedy inference with structure-exploiting lazy maps
Authors: Michael Brennan, Daniele Bigoni, Olivier Zahm, Alessio Spantini, Youssef Marzouk
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | §4 Numerical examples: We present numerical demonstrations of the lazy framework as follows. We first illustrate Algorithm 2 on a 2-dimensional toy example, where we show the progressive Gaussianization of the posterior using a sequence of 1-dimensional lazy maps. We then demonstrate the benefits of the lazy framework (Algorithms 1 and 2) in several challenging inference problems. We consider Bayesian logistic regression and a Bayesian neural network, and compare the performance of a baseline transport map to lazy maps using the same underlying transport class. We measure performance improvements in four ways: (1) the final ELBO achieved by the transport maps after training; (2) and (3) the final trace diagnostics $\tfrac{1}{2}\mathrm{Tr}(H_B^\ell)$ and $\tfrac{1}{2}\mathrm{Tr}(H^\ell)$, which bound the error $D_{\mathrm{KL}}(\pi \,\|\, (T_\ell)_\sharp \rho)$; and (4) the variance diagnostic $\tfrac{1}{2}\mathbb{V}_\rho[\log \rho / (T_\ell)^\sharp \pi]$, which is an asymptotic approximation of $D_{\mathrm{KL}}((T_\ell)_\sharp \rho \,\|\, \pi)$ as $(T_\ell)_\sharp \rho \to \pi$ (see [40]). Finally, we highlight the advantages of greedily training lazy maps in a nonlinear problem defined by a high-dimensional elliptic partial differential equation (PDE), often used for testing high-dimensional inference methods [4, 16, 53]. |
| Researcher Affiliation | Academia | Michael C. Brennan, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (EMAIL); Daniele Bigoni, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (EMAIL); Olivier Zahm, Université Grenoble Alpes, INRIA, CNRS, LJK, 38000 Grenoble, France (EMAIL); Alessio Spantini, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (EMAIL); Youssef Marzouk, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (EMAIL) |
| Pseudocode | Yes | Algorithm 1 Construction of a lazy map. [...] Algorithm 2 Construction of a deeply lazy map |
| Open Source Code | Yes | Code for the numerical examples can be found at https://github.com/MichaelCBrennan/lazymaps and http://bit.ly/2QlelXF. |
| Open Datasets | Yes | We consider a high-dimensional Bayesian logistic regression problem using the UCI Parkinson's disease classification data [1], studied in [49]. [...] UCI yacht hydrodynamics data set [2]. [...] Data for 4.4, G.4, and G.5 can be downloaded at http://bit.ly/2X09Ns8, http://bit.ly/2HytQc0 and http://bit.ly/2Eug5ZR. |
| Dataset Splits | No | The paper mentions using specific datasets but does not provide explicit details on training, validation, or test splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only mentioning the software frameworks used. |
| Software Dependencies | No | The paper mentions software like the 'Transport Maps framework [7]', the 'TensorFlow Probability library [19]', 'FEniCS [37]', and 'dolfin-adjoint [22]' but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We consider degree 3 polynomial maps as the underlying transport class. We use Gauss quadrature rules of order 10 for the discretization of the KL divergence and the approximation of $H_B^\ell$ ($m = 121$ in Algorithms 3 and 5). [...] We choose a relatively uninformative prior of $\mathcal{N}(0, 10^2 I_d)$. [...] In G3-IAF, each layer has rank $r = 200$. [...] Our inference problem is 581-dimensional, given a network input dimension of 6, one hidden layer of dimension 20, and an output layer of dimension 1. We use sigmoid activations in the input and hidden layer, and a linear output layer. Model parameters are endowed with independent Gaussian priors with zero mean and variance 100. [...] Expectations appearing in the algorithm are discretized with $m = 500$ Monte Carlo samples. To not waste work in the early iterations, we use affine maps of rank $r = 4$ for iterations $\ell = 1, \ldots, 5$. Then we switch to polynomial maps of degree 2 and rank $r = 2$ for the remaining iterations. |
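The variance diagnostic quoted in the excerpts above, $\tfrac{1}{2}\mathbb{V}_\rho[\log \rho / T^\sharp \pi]$, can be estimated with plain Monte Carlo over reference samples. Below is a minimal sketch (not the authors' code, which uses the Transport Maps framework and TensorFlow Probability): it assumes a standard-Gaussian reference $\rho$, a hypothetical 2-D Gaussian target $\pi$, and an affine map $T$ chosen to push $\rho$ forward to $\pi$ exactly, so the diagnostic should be near zero.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2

# Hypothetical target pi = N(mu, Sigma), with Sigma = L @ L.T.
mu = np.array([1.0, -0.5])
L = np.array([[1.0, 0.0], [0.4, 0.8]])
Sigma = L @ L.T

def log_rho(x):
    """Log-density of the standard-Gaussian reference rho = N(0, I)."""
    return -0.5 * np.sum(x**2, axis=1) - 0.5 * d * np.log(2 * np.pi)

def log_pi(y):
    """Log-density of the target pi = N(mu, Sigma)."""
    diff = y - mu
    sol = np.linalg.solve(Sigma, diff.T).T
    return (-0.5 * np.sum(diff * sol, axis=1)
            - 0.5 * d * np.log(2 * np.pi)
            - 0.5 * np.log(np.linalg.det(Sigma)))

def T(x):
    """Affine map pushing rho forward to pi exactly: T(x) = L x + mu."""
    return x @ L.T + mu

# Constant log-Jacobian-determinant of the affine map.
log_det_T = np.log(abs(np.linalg.det(L)))

# Monte Carlo variance diagnostic with m = 500 samples, as in the excerpt:
#   0.5 * Var_rho[ log rho(x) - log (T^sharp pi)(x) ],
# where log (T^sharp pi)(x) = log pi(T(x)) + log |det grad T(x)|.
x = rng.standard_normal((500, d))
log_pullback = log_pi(T(x)) + log_det_T
diag = 0.5 * np.var(log_rho(x) - log_pullback)
print(diag)  # ~0, since T is exact here
```

For an imperfect map the diagnostic stays strictly positive, and (per the excerpt) it approximates $D_{\mathrm{KL}}((T_\ell)_\sharp \rho \,\|\, \pi)$ only asymptotically, as the pushforward approaches the target.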