Neural Likelihood Approximation for Integer Valued Time Series Data

Authors: Luke O'Loughlin, Andrew J. Black, John Maclean

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate our method by performing inference on a number of ecological and epidemiological models, showing that we can accurately approximate the true posterior while achieving significant computational speed ups compared to current best methods. ... We evaluate our methods on simulated data from three different models, comparing the results to the exact sampling method PMMH (Andrieu et al., 2010)."
Researcher Affiliation | Academia | Luke O'Loughlin (EMAIL), Department of Mathematical Sciences, University of Adelaide; John Maclean (EMAIL), Department of Mathematical Sciences, University of Adelaide; Andrew Black (EMAIL), Department of Mathematical Sciences, University of Adelaide
Pseudocode | Yes | Algorithm 1: Evaluation of N_ψ(y_{1:n}, θ).
Input: Time series with initial value y_{0:n}; first CNN layer ℓ; r residual blocks ℓ_res^(1), ..., ℓ_res^(r); final CNN layer ℓ_f; neural network parameters ψ.
Output: Sequence of shift, scale, and mixture-proportion parameters for a mixture of c discretised logistic conditionals.
1: Calculate the context c ← W_c^2 σ(W_c^1 θ + b_c).
2: Apply the first layer z_{1:n} ← ℓ(y_{1:n}, c).
3: for k = 1, ..., r do
4:   Apply the kth residual block z_{1:n} ← ℓ_res^(k)(z_{1:n}, c).
5: Apply the final layer o_{1:n} ← ℓ_f(z_{1:n}, c).
6: Calculate the logistic parameters:
7: for i = 2, ..., n do
8:   for j = 1, ..., c do
9:     Calculate the shifts μ_i^j ← o_{i-1}^j + y_{i-1}.
10:    Calculate the scales s_i^j ← softplus(o_{i-1}^{j+c}) + 10^{-6}.
11:    Calculate the mixture proportions w_{i,j} ∝ exp(o_{i-1}^{j+2c}).
12: Return μ_{2:n}, s_{2:n}, w_{2:n}.
Open Source Code | No | The paper does not contain an explicit statement about open-sourcing the code or a link to a code repository.
Open Datasets | No | "We evaluate our methods on simulated data from three different models... To simulate the data, we use parameters b = 0.26, d1 = 0.1, d2 = 0.01, p1 = 0.13, p2 = 0.05. We use informative priors on b, d1 and d2..." The paper states that the data is simulated by the authors, but does not provide concrete access information (link, repository, citation) for this simulated data.
Dataset Splits | Yes | "For all experiments, we used 90% of the data for training and the other 10% for validation."
Hardware Specification | Yes | "All experiments were performed using a single NVIDIA RTX 3060TI GPU, which provides a significant speed up over a CPU due to the use of convolutions."
Software Dependencies | Yes | "All neural network models were constructed using JAX (Bradbury et al., 2018) and the MCMC was run using numpyro (Phan et al., 2019)."
Experiment Setup | Yes | "We used a kernel length of 5 for all experiments, and since all observations have relatively low to no noise, this was sufficient to ensure that p(y_i | y_{i-m:i-1}, θ) ≈ p(y_i | y_{1:i-1}, θ) for all experiments. We used 5 mixture components in all experiments. ... For the SIR/SEIAR model experiments, we used 64 hidden channels and 2/3 residual blocks. For the predator-prey model experiments, we used 100 hidden channels and 3 residual blocks. ... We used Adam with weight decay (Loshchilov & Hutter, 2017) as the optimiser, with a weight decay parameter of 10^{-5} and a learning rate of 0.0003. The only other form of regularisation we used was early stopping, with a patience parameter of 30 for all experiments. For the SIR/SEIAR/predator-prey model experiments, we used a batch size of 1024/256/512 to calculate the stochastic gradients."
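The parameter-extraction steps of the quoted Algorithm 1 (shifts, scales, and mixture proportions from the final network output) can be sketched in NumPy as follows. This is a minimal sketch, not the authors' code: the function name `logistic_mixture_params`, the array shapes, and the softmax normalisation of the proportions (the algorithm only states w_{i,j} ∝ exp(·)) are assumptions; the paper's implementation uses JAX.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def logistic_mixture_params(o, y, c):
    """Steps 6-12 of Algorithm 1 (as quoted above).

    o : (n, 3c) array of final-layer outputs o_{1:n}.
    y : (n,) array of observations y_{1:n}.
    Returns shifts mu, scales s, proportions w, each (n-1, c),
    covering time steps i = 2, ..., n.
    """
    o_prev = o[:-1]                        # o_{i-1} for i = 2, ..., n
    mu = o_prev[:, :c] + y[:-1, None]      # mu_i^j = o_{i-1}^j + y_{i-1}
    s = softplus(o_prev[:, c:2*c]) + 1e-6  # scales kept strictly positive
    logits = o_prev[:, 2*c:3*c]            # w_{i,j} proportional to exp(logits)
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)      # normalise so each row sums to 1
    return mu, s, w
```

Centring each shift on the previous observation y_{i-1} means the network only has to predict the increment of the series, which is the natural parameterisation for count data with small step-to-step changes.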
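The 90%/10% train/validation split reported in the Dataset Splits row can be sketched as below. The helper name and the contiguous-tail split are assumptions for illustration; the paper does not say how the split is drawn.

```python
def train_val_split(data, val_frac=0.1):
    """Hold out the last val_frac of the simulated series for validation."""
    n_val = max(1, int(len(data) * val_frac))  # guard against an empty split
    return data[:-n_val], data[-n_val:]
```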
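The early-stopping regularisation quoted in the Experiment Setup row (patience of 30 on the validation loss) can be sketched as a standalone loop. The helper `early_stopping_epoch` is hypothetical; it only illustrates the patience rule, not the authors' training harness.

```python
def early_stopping_epoch(val_losses, patience=30):
    """Return the epoch at which training stops: either when the
    validation loss has not improved for `patience` epochs, or at
    the last epoch if the patience budget is never exhausted."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0       # improvement: reset the counter
        else:
            wait += 1                  # no improvement this epoch
            if wait >= patience:
                return epoch           # patience exhausted, stop here
    return len(val_losses) - 1
```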