Bayesian Space-Time Partitioning by Sampling and Pruning Spanning Trees

Authors: Leonardo V. Teixeira, Renato M. Assunção, Rosangela H. Loschi

JMLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section 5 we present a simulation study comparing the proposed model with some other well-known methods for regionalization in the spatial context. The analysis of some real data from Brazil is carried out in Section 6.
Researcher Affiliation | Academia | Leonardo V. Teixeira (EMAIL), Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA; Renato M. Assunção (EMAIL), Departamento de Ciência da Computação, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil; Rosangela H. Loschi (EMAIL), Departamento de Estatística, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
Pseudocode | No | The paper describes the steps of the Gibbs sampler algorithm in text and mathematical formulas across Sections 3 and 4, particularly in '4.1. Sampling a Partition Compatible with the Current Tree' and '4.2. Sampling a tree compatible with the current partition', but does not feature a dedicated, structured pseudocode block or algorithm box.
Open Source Code | No | Our algorithm was implemented in C++ and is available upon request.
Open Datasets | Yes | To illustrate the purely spatial case, we partition the Brazilian map based on the Human Development Index (HDI) and the Brazilian South region based on bladder and lung cancer mortality rates. The space-time case is illustrated with HDI in three decades. ... Data are available in the DATASUS website (http://datasus.saude.gov.br/). We also obtained demographic information of the same years, for the same age groups and gender, from IBGE.
Dataset Splits | No | The paper describes the generation of simulated data in Section 5 ('Analysis of Simulated Data Sets') and the use of real-world data (HDI, cancer mortality) in Section 6 ('Case Studies'). For both simulated and real data, the focus is on regionalization and cluster identification, and the evaluation metrics compare the model's output to ground truth. However, there is no mention of explicit train/test/validation splits, nor is there a description of methodology for partitioning the datasets for model training and subsequent evaluation in a supervised learning sense.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or computing infrastructure used for running the experiments.
Software Dependencies | Yes | For Poisson data sets, the proposed spatial PPM is also compared to the BDCD model, implemented in C++ code (available at http://www.statistik.lmu.de/sfb386/software/bdcd/download.html) and to BPM, implemented in R software by ourselves. ... The Python ClusterPy library (Duque et al., 2011). ... ClusterPy: Library of spatially constrained clustering algorithms, Version 0.9.9.
Experiment Setup | Yes | For the MCMC we run a chain of 5000 iterations, skipping the first 500 samples as a burn-in period and, to avoid correlation, we use a thinning of 5 simulated values. ... normal-gamma prior distribution with $\mu_{G_k} \mid \tau_{G_k} \sim \text{Normal}(m, [v\tau_{G_k}]^{-1})$ and $\tau_{G_k} \sim \text{Gamma}(a, b)$, where $m = 0.65$, $v = 1$, $a = 400$ and $b = 1$. For the Poisson data, $Y_i \mid \phi_{G_k} \overset{iid}{\sim} \text{Poisson}(E_i \phi_{G_k})$ and $\phi_{G_k} \sim \text{Gamma}(a, b)$, where $a = b = 2$. ... we assume that each edge is removed from the spanning tree with probability $\rho \sim \text{Beta}(5, 1000)$.
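
The MCMC schedule and the edge-removal prior quoted in the Experiment Setup row can be illustrated with a small Python sketch. This is not the authors' C++ implementation, which is available only on request. The function names `retained_iterations` and `prune_tree`, the zero-based indexing, the reading of "thinning of 5" as keeping every fifth draw after burn-in, and the toy 6-node path tree are all assumptions made for illustration. The sketch only shows how independently removing edges from a spanning tree induces a partition into connected clusters; it does not sample from the posterior described in the paper.

```python
import random


def retained_iterations(n_iter=5000, burn_in=500, thin=5):
    """Zero-based indices of draws kept after burn-in and thinning.

    Assumed scheme: drop the first `burn_in` draws, then keep every
    `thin`-th draw of the remainder.
    """
    return list(range(burn_in, n_iter, thin))


def prune_tree(n_nodes, tree_edges, rho, rng):
    """Remove each tree edge independently with probability rho.

    Returns (labels, kept_edges), where labels[i] is the cluster index of
    node i, i.e. the connected component of the pruned tree containing i.
    """
    parent = list(range(n_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # An edge survives when the uniform draw is at least rho.
    kept_edges = [e for e in tree_edges if rng.random() >= rho]
    for u, v in kept_edges:
        parent[find(u)] = find(v)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    seen = {}
    labels = [seen.setdefault(find(i), len(seen)) for i in range(n_nodes)]
    return labels, kept_edges


if __name__ == "__main__":
    kept_draws = retained_iterations()
    print(len(kept_draws))  # 900 draws retained under the assumed scheme

    rng = random.Random(0)           # fixed seed, illustration only
    rho = rng.betavariate(5, 1000)   # prior draw for the removal probability
    path_tree = [(i, i + 1) for i in range(5)]  # hypothetical spanning tree
    labels, kept_edges = prune_tree(6, path_tree, rho, rng)
    print(rho, labels)
```

Because a tree on n nodes has n - 1 edges, removing k of them always leaves exactly k + 1 clusters. A Beta(5, 1000) prior has mean 5/1005, roughly 0.005, so it favours partitions with few clusters.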