Neural Likelihood Approximation for Integer Valued Time Series Data
Authors: Luke O'Loughlin, Andrew J. Black, John Maclean
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our method by performing inference on a number of ecological and epidemiological models, showing that we can accurately approximate the true posterior while achieving significant computational speed ups compared to current best methods. ... We evaluate our methods on simulated data from three different models, comparing the results to the exact sampling method PMMH (Andrieu et al., 2010). |
| Researcher Affiliation | Academia | Luke O'Loughlin EMAIL Department of Mathematical Sciences, University of Adelaide; John Maclean EMAIL Department of Mathematical Sciences, University of Adelaide; Andrew Black EMAIL Department of Mathematical Sciences, University of Adelaide |
| Pseudocode | Yes | Algorithm 1: Evaluation of N_ψ(y_{1:n}, θ). Input: time series with initial value y_{0:n}; first CNN layer ℓ; r residual blocks ℓ_res^{(1)}, …, ℓ_res^{(r)}; final CNN layer ℓ_f; neural network parameters ψ. Output: sequence of shift, scale and mixture proportions for a mixture of c discretised logistic conditionals. 1: Calculate the context c ← W_c^2 σ(W_c^1 θ + b_c). 2: Apply the first layer z_{1:n} ← ℓ(y_{1:n}, c). 3: for k = 1, …, r do 4: apply the k-th residual block z_{1:n} ← ℓ_res^{(k)}(z_{1:n}, c). 5: Apply the final layer o_{1:n} ← ℓ_f(z_{1:n}, c). 6: Calculate the logistic parameters: 7: for i = 2, …, n do 8: for j = 1, …, c do 9: calculate the shifts µ_i^j ← o_{i−1}^j + y_{i−1}; 10: calculate the scales s_i^j ← softplus(o_{i−1}^{j+c}) + 10^{−6}; 11: calculate the mixture proportions w_{i,j} ∝ exp(o_{i−1}^{j+2c}). 12: Return µ_{2:n}, s_{2:n}, w_{2:n}. |
| Open Source Code | No | The paper does not contain an explicit statement about open-sourcing the code or a link to a code repository. |
| Open Datasets | No | We evaluate our methods on simulated data from three different models... To simulate the data, we use parameters b = 0.26, d1 = 0.1, d2 = 0.01, p1 = 0.13, p2 = 0.05. We use informative priors on b, d1 and d2... The paper states the data is simulated by the authors, but does not provide concrete access information (link, repository, citation) for this simulated data. |
| Dataset Splits | Yes | For all experiments, we used 90% of the data for training and the other 10% for validation. |
| Hardware Specification | Yes | All experiments were performed using a single NVIDIA RTX 3060TI GPU, which provides a significant speed up over a CPU due to the use of convolutions. |
| Software Dependencies | Yes | All neural network models were constructed using JAX (Bradbury et al., 2018) and the MCMC was run using numpyro (Phan et al., 2019). |
| Experiment Setup | Yes | We used a kernel length of m = 5 for all experiments, and since all observations have relatively low to no noise, this was sufficient to ensure that p(y_i \| y_{i−m:i−1}, θ) ≈ p(y_i \| y_{1:i−1}, θ) for all experiments. We used 5 mixture components in all experiments. ... For the SIR/SEIAR model experiments, we used 64 hidden channels and 2/3 residual blocks. For the predator-prey model experiments, we used 100 hidden channels and 3 residual blocks. ... We used Adam with weight decay (Loshchilov & Hutter, 2017) as the optimiser, with a weight decay parameter of 10^{−5} and a learning rate of 0.0003. The only other form of regularisation we used was early stopping, with a patience parameter of 30 for all experiments. For the SIR/SEIAR/predator-prey model experiments, we used a batch size of 1024/256/512 to calculate the stochastic gradients. |
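The parameter-extraction step of Algorithm 1 (the "Pseudocode" row above, steps 6–12) can be sketched in JAX, the library the paper reports using. This is a minimal sketch, not the authors' code: the layout of the final CNN output `o` as `[shifts | scales | mixture logits]` along the channel axis, and the shapes of `o` and `y`, are assumptions.

```python
import jax.numpy as jnp
from jax.nn import softplus, softmax

def logistic_mixture_params(o, y, c):
    """Sketch of Algorithm 1, steps 6-12 (assumed tensor layout).

    o : final CNN layer output o_{1:n}, assumed shape (n, 3*c) with
        channels laid out as [shifts | scales | mixture logits].
    y : observed integer time series y_{1:n}, shape (n,).
    c : number of discretised logistic mixture components.
    Returns shifts, scales, and mixture proportions for i = 2, ..., n.
    """
    # Shifts mu_i^j = o_{i-1}^j + y_{i-1}: each conditional is centred
    # near the previous observation (an autoregressive residual).
    mu = o[:-1, :c] + y[:-1, None]
    # Scales s_i^j = softplus(o_{i-1}^{j+c}) + 1e-6: strictly positive.
    s = softplus(o[:-1, c:2 * c]) + 1e-6
    # Mixture proportions w_{i,j} proportional to exp(o_{i-1}^{j+2c}),
    # normalised over the c components.
    w = softmax(o[:-1, 2 * c:], axis=-1)
    return mu, s, w  # each of shape (n - 1, c)
```

Vectorising over `i` and `j`, as here, replaces the algorithm's explicit double loop; on GPU this matches the paper's note that convolutions make the network evaluation fast.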
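The data-handling choices in the "Dataset Splits" and "Experiment Setup" rows (90/10 train/validation split, early stopping with patience 30) can be sketched as follows. This is an illustrative sketch, not the authors' code; the function names and the random shuffling of the split are assumptions.

```python
import numpy as np

def train_val_split(dataset, val_frac=0.1, seed=0):
    """90/10 train/validation split, as reported (shuffling assumed)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(dataset))
    n_val = int(round(val_frac * len(dataset)))
    return dataset[idx[n_val:]], dataset[idx[:n_val]]

def should_stop(val_losses, patience=30):
    """Early stopping: True once the best validation loss is
    `patience` epochs old (patience 30, as reported)."""
    best = int(np.argmin(val_losses))
    return len(val_losses) - 1 - best >= patience
```

With this patience criterion, training continues as long as some epoch in the last 30 improved on the best validation loss seen so far, matching the usual reading of a patience parameter.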