reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Bayesian Space-Time Partitioning by Sampling and Pruning Spanning Trees

Authors: Leonardo V. Teixeira, Renato M. Assunção, Rosangela H. Loschi

JMLR 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In Section 5 we present a simulation study comparing the proposed model with some other well-known methods for regionalization in the spatial context. The analysis of some real data from Brazil is carried out in Section 6.
Researcher Affiliation	Academia	Leonardo V. Teixeira EMAIL Department of Computer Science Purdue University West Lafayette, IN 47907, USA Renato M. Assunção EMAIL Departamento de Ciência da Computação Universidade Federal de Minas Gerais Belo Horizonte, Brazil Rosangela H. Loschi EMAIL Departamento de Estatística Universidade Federal de Minas Gerais Belo Horizonte, Brazil
Pseudocode	No	The paper describes the steps of the Gibbs sampler algorithm in text and mathematical formulas across Sections 3 and 4, particularly in '4.1. Sampling a Partition Compatible with the Current Tree' and '4.2. Sampling a tree compatible with the current partition', but does not feature a dedicated, structured pseudocode block or algorithm box.
Open Source Code	No	Our algorithm was implemented in C++ and is available upon request.
Open Datasets	Yes	To illustrate the purely spatial case, we partition the Brazilian map based on the Human Development Index (HDI) and the Brazilian South region based on bladder and lung cancer mortality rates. The space-time case is illustrated with HDI in three decades. ... Data are available in the DATASUS website (http://datasus.saude.gov.br/). We also obtained demographic information of the same years, for the same age groups and gender, from IBGE.
Dataset Splits	No	The paper describes the generation of simulated data in Section 5 ('Analysis of Simulated Data Sets') and the use of real-world data (HDI, cancer mortality) in Section 6 ('Case Studies'). For both simulated and real data, the focus is on regionalization and cluster identification, and the evaluation metrics compare the model's output to ground truth. However, there is no mention of explicit train/test/validation splits, nor is there a description of methodology for partitioning the datasets for model training and subsequent evaluation in a supervised learning sense.
Hardware Specification	No	The paper does not provide any specific hardware details such as GPU/CPU models, memory, or computing infrastructure used for running the experiments.
Software Dependencies	Yes	For Poisson data sets, the proposed spatial PPM is also compared to the BDCD model, implemented in C++ code (available at http://www.statistik.lmu.de/sfb386/software/bdcd/download.html) and to BPM, implemented in R software by ourselves. ... The Python ClusterPy library (Duque et al., 2011). ... ClusterPy: Library of spatially constrained clustering algorithms, Version 0.9.9.
Experiment Setup	Yes	For the MCMC we run a chain of 5000 iterations, skipping the first 500 samples as a burn-in period and, to avoid correlation, we use a thinning of 5 simulated values. ... normal-gamma prior distribution with $\mu_{G_k} \mid \tau_{G_k} \sim \text{Normal}(m, [v\tau_{G_k}]^{-1})$ and $\tau_{G_k} \sim \text{Gamma}(a, b)$, where m = 0.65, v = 1, a = 400 and b = 1. For the Poisson data, $Y_i \mid \phi_{G_k} \overset{iid}{\sim} \text{Poisson}(E_i \phi_{G_k})$ and $\phi_{G_k} \sim \text{Gamma}(a, b)$, where a = b = 2. ... we assume that each edge is removed from the spanning tree with probability $\rho \sim \text{Beta}(5, 1000)$.
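
The MCMC settings and priors quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' C++ implementation. It only shows how many draws survive the stated burn-in and thinning, and how one draw from each quoted prior might be generated. The function names are hypothetical, and two points are assumptions rather than facts from the paper: that thinning starts at the first post-burn-in iteration, and that Gamma(a, b) is parameterized by shape a and rate b.

```python
import numpy as np

# Settings quoted in the Experiment Setup row.
N_ITER, BURN_IN, THIN = 5000, 500, 5


def retained_iterations(n_iter=N_ITER, burn_in=BURN_IN, thin=THIN):
    """Return the iteration indices kept after burn-in and thinning.

    Assumption: thinning keeps every `thin`-th iteration starting at the
    first post-burn-in iteration.
    """
    return np.arange(burn_in, n_iter, thin)


def draw_normal_gamma(rng, m=0.65, v=1.0, a=400.0, b=1.0):
    """Draw (mu, tau) from the quoted normal-gamma prior.

    Assumption: Gamma(a, b) uses shape a and rate b, so NumPy's scale
    argument is 1 / b.
    """
    tau = rng.gamma(shape=a, scale=1.0 / b)
    # mu | tau ~ Normal(m, [v * tau]^{-1}), i.e. variance 1 / (v * tau).
    mu = rng.normal(loc=m, scale=np.sqrt(1.0 / (v * tau)))
    return mu, tau


def draw_edge_removal_prob(rng, alpha=5.0, beta=1000.0):
    """Draw rho ~ Beta(5, 1000), the prior edge-removal probability."""
    return rng.beta(alpha, beta)


rng = np.random.default_rng(0)
kept = retained_iterations()
mu, tau = draw_normal_gamma(rng)
rho = draw_edge_removal_prob(rng)
print(len(kept), kept[0], kept[-1])
```

Under these assumptions the chain retains 900 draws. The Beta(5, 1000) prior has mean 5/1005, roughly 0.005, so it favors removing few edges from the spanning tree and therefore partitions with few clusters.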