Virtual-Event-Based Posterior Sampling and Inference for Neyman-Scott Processes

Authors: Chengkuan Hong, Christian R. Shelton, Jun Zhu

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experimental results demonstrate that the prediction based on our sampling and inference algorithms for NSPs can achieve good prediction performance compared with state-of-the-art methods." Keywords: Markov chain Monte Carlo, variational inference, point processes, hierarchical model, Neyman-Scott processes
Researcher Affiliation | Academia | Chengkuan Hong (EMAIL), Department of Computer Science and Technology, BNRist Center, Tsinghua Institute for AI, Tsinghua-Bosch Joint Center for ML, Tsinghua University, Beijing 100084, China; Christian R. Shelton (EMAIL), Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA; Jun Zhu (EMAIL), Department of Computer Science and Technology, BNRist Center, Tsinghua Institute for AI, Tsinghua-Bosch Joint Center for ML, Tsinghua University, Beijing 100084, China
Pseudocode | Yes | Algorithm 1 (Inference for NSPs). Input: data x and model M. Initialization: model parameters Θ_0, VPP parameters Θ̃_0, number of samples per iteration S, initial sample for RPPs z^(0,S) = {z_1^(0,S), ..., z_L^(0,S)}, initial sample for VPPs z̃^(0,S) = {z̃_1^(0,S), ..., z̃_L^(0,S)}, and number of iterations N. Output: Θ_N ≈ Θ*, Θ̃_N ≈ Θ̃*. Also: Algorithm 2 (Re-sample), Algorithm 3 (Flip), Algorithm 4 (Swap).
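The alternating sample-then-update structure that Algorithm 1 describes is a Monte Carlo EM loop. A minimal, self-contained sketch of that loop on a toy Gaussian latent-variable model (the model, variable names, and updates below are illustrative stand-ins, not the paper's NSP sampler, which uses re-sample/flip/swap moves over virtual events):

```python
import random
import statistics

def mcem(x, n_iters=50, n_samples=200, seed=0):
    """Toy Monte Carlo EM in the shape of Algorithm 1: alternate between
    sampling the hidden variables (E-step) and updating the model
    parameter from those samples (M-step).  Assumed toy model:
    x_i ~ N(z_i, 1) with hidden z_i ~ N(mu, 1)."""
    rng = random.Random(seed)
    mu = 0.0  # stand-in for the initial parameters Theta_0
    for _ in range(n_iters):
        # E-step: here the posterior z_i | x_i, mu is exactly
        # N((x_i + mu)/2, 1/2); the paper instead draws posterior
        # samples with MCMC moves (re-sample, flip, swap).
        draws = [rng.gauss((xi + mu) / 2.0, 0.5 ** 0.5)
                 for _ in range(n_samples) for xi in x]
        # M-step: maximum-likelihood update of mu given the sampled z's
        mu = statistics.fmean(draws)
    return mu
```

For this toy model the fixed point is the data mean, so `mcem([1.0, 2.0, 3.0])` converges to roughly 2.0; the point is only the loop structure, not the model.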
Open Source Code | No | The text does not contain an explicit statement about releasing code or a link to a code repository for the methodology described in this paper.
Open Datasets | Yes | The real-world datasets we use are retweets (Zhao et al., 2015), earthquakes (NCEDC, 2014; BDSN, 2014; HRSN, 2014; BARD, 2014), and homicides (COC, 2022) as in Hong and Shelton (2022, 2023). ... The data for the earthquakes came from the High Resolution Seismic Network (HRSN), doi:10.7932/HRSN; the Berkeley Digital Seismic Network (BDSN), doi:10.7932/BDSN; and the Bay Area Regional Deformation Network (BARD), doi:10.7932/BARD, all operated by the UC Berkeley Seismological Laboratory and archived at the Northern California Earthquake Data Center (NCEDC), doi:10.7932/NCEDC. ... COC. City of Chicago, Crimes 2001 to present. https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-Present/ijzp-q8t2, 2022. Accessed: 2022-08-14.
Dataset Splits | No | We split each dataset into training, validation, and test sets. The training sets are used to train the model parameters and the variational parameters. The validation sets are used for early stopping. The test sets are used to calculate the metrics for performance comparison.
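The split procedure quoted above can be sketched in a few lines. The 60/20/20 fractions below are placeholders, since the paper does not state its split ratios:

```python
def split_sequences(seqs, train_frac=0.6, val_frac=0.2):
    """Split a list of event sequences into train/validation/test sets,
    preserving order.  Fractions are illustrative defaults, not the
    paper's (unstated) ratios."""
    n = len(seqs)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = seqs[:n_train]                      # fit model + variational params
    val = seqs[n_train:n_train + n_val]         # early stopping
    test = seqs[n_train + n_val:]               # held-out metrics
    return train, val, test
```

Applied to ten sequences this yields sets of sizes 6, 2, and 2, with no sequence appearing in more than one set.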
Hardware Specification | Yes | Table 7 (training time, retweet dataset): 2 cores from an Intel Xeon Silver 4214 CPU @ 2.20GHz + 1 NVIDIA GeForce RTX 2080 Ti; 1 Intel Xeon Gold 6330 CPU @ 2.00GHz (shared with other programs) + 1 NVIDIA GeForce RTX 4090. Table 8 (training time, earthquake dataset): 1 AMD EPYC 7763 64-core processor (shared with other programs) + 1 NVIDIA GeForce RTX 3090. Table 9 (training time, homicide dataset): 1 Intel Xeon Silver 4214R CPU @ 2.40GHz (shared with other programs) + 1 NVIDIA A40.
Software Dependencies | No | The paper discusses various models and techniques, including 'deep neural networks' and 'graph neural networks', and mentions 'PyTorch' in the authors' previous work (Hong and Shelton, 2022, 2023), but it does not provide version numbers for any software used in its own experiments. Other software such as 'CPLEX', 'Gecode', and 'Choco' is mentioned only as general examples, not as dependencies of this paper's methods.
Experiment Setup | Yes | As described in Hong and Shelton (2022, 2023), the prediction is performed by utilizing the samples from the approximate posterior distribution (MCMC or virtual point processes) of the hidden points. The procedure of the prediction is briefly described in Algorithm 5. We adopt the Weibull kernel as in Example 2. ... Initialization: the model parameters Θ (the kernel parameters θ are from the training process, and the constant intensity functions {µ_k, k = 1, ..., K_L} are set as the mean of the constant intensity functions for all training sequences), the variational parameters Θ̃, the number of iterations for MCEM C (C can be set as 1 in practice), and the sample size S.
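The quoted setup adopts a Weibull kernel (the paper's Example 2). A minimal sketch of a Weibull-kernel conditional intensity for a temporal cluster process, with illustrative parameter values and a hypothetical parameterization (not necessarily the paper's):

```python
import math

def weibull_kernel(t, shape=1.5, scale=1.0):
    """Weibull density used as a triggering kernel: the intensity a
    parent event at time 0 induces at lag t > 0.  Shape/scale values
    are illustrative, not taken from the paper."""
    if t <= 0:
        return 0.0
    z = t / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def intensity(t, parents, mu=0.1, shape=1.5, scale=1.0):
    """Conditional intensity at time t: a constant background rate mu
    plus the summed Weibull kernels of earlier (hidden) parent events."""
    return mu + sum(weibull_kernel(t - p, shape, scale)
                    for p in parents if p < t)
```

Because the kernel is a probability density, each parent contributes one expected offspring in total; with no parents the intensity reduces to the background rate `mu`.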