Variational Phylogenetic Inference with Products over Bipartitions

Authors: Evan Sidrow, Alexandre Bouchard-Côté, Lloyd T. Elliott

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through experiments on benchmark genomic datasets and an application to the viral RNA of SARS-CoV-2, we demonstrate that our method achieves competitive accuracy while requiring significantly fewer gradient evaluations than existing state-of-the-art techniques.
Researcher Affiliation | Academia | 1Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, Canada; 2Department of Statistics, University of British Columbia, Vancouver, Canada. Correspondence to: Evan Sidrow <EMAIL>.
Pseudocode | Yes | Algorithm 1 Single-Linkage Clustering(T, X0) ... Algorithm 2 Sample-q(µ, σ, X) ... Algorithm 3 VIPR(X, K)
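The paper's Algorithm 1 (Single-Linkage Clustering) is only named above, and its arguments (T, X0) are not defined in this excerpt. As generic context only, a minimal pure-Python sketch of single-linkage agglomerative clustering on 1-D points — the function name, inputs, and stopping rule are illustrative assumptions, not the paper's version:

```python
def single_linkage(points, k):
    """Agglomerative clustering: repeatedly merge the two clusters whose
    closest members are nearest, until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single-linkage distance: minimum over all cross-cluster pairs
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair
    return clusters

print(single_linkage([0, 1, 10, 11], 2))  # [[0, 1], [10, 11]]
```

In practice a library routine (e.g. SciPy's hierarchical clustering with `method='single'`) would replace this O(n³) loop; the sketch is only meant to make the merge rule concrete.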
Open Source Code | Yes | Our code implementing VIPR is available at https://github.com/EvanSidrow/VIPR.
Open Datasets | Yes | We studied eleven commonly used genetic datasets that are listed in Lakner et al. (2008), denoted DS1 through DS11... We also studied a dataset of 72 COVID-19 genomes obtained from GISAID (Global Initiative on Sharing All Influenza Data; Khare et al., 2021)... We simulated seven datasets using the ms software (Hudson, 2002).
Dataset Splits | No | The paper describes processing MCMC chains with burn-in and thinning to obtain a 'gold standard' and 'subsplit support', and mentions subsetting sites (e.g., 'We subset the genomes to M = 3,101 non-homologous sites'). However, it does not explicitly describe training/validation/test splits for machine-learning experiments on the input genetic datasets.
Hardware Specification | Yes | Each run for the experiments on the DS1 to DS11 datasets was executed on a supercomputer node. The runs were allocated 12 hours of wallclock time, 1 CPU, and 16 GB of RAM. The supercomputer had a heterogeneous infrastructure in which each CPU make and model was Intel v4 Broadwell, Intel Cascade Lake or Skylake, or AMD EPYC 7302. ... These experiments were run on a 2019 MacBook Pro with 16 GB of RAM and a 2.6 GHz 6-core Intel i7 CPU.
Software Dependencies | No | The paper mentions several software tools, such as MAFFT (Katoh and Standley, 2013), Adam (Kingma and Ba, 2014) implemented in PyTorch (Paszke et al., 2019), Autograd (Maclaurin et al., 2015), BEAST, the ms software (Hudson, 2002), and matplotlib (Hunter, 2007). However, it does not provide version numbers for any of these dependencies, which are necessary for a reproducible description.
Experiment Setup | Yes | For all methods considered (BEAST, VBPI, and our VIPR methods), we used a Kingman coalescent prior on the phylogenies. We fixed the effective population size at Ne = 5 (Kingman, 1982) and assumed the Jukes-Cantor model for mutation (Jukes and Cantor, 1969). ... To approximate the true posterior distribution of each dataset, we ran 10 independent MCMC chains using BEAST, each with 10,000,000 iterations. We discarded the first 250,000 iterations as burn-in and thinned to every 1,000-th iteration. ... We used the Adam optimization algorithm implemented in PyTorch with four random restarts and learning rates of 0.003, 0.001, 0.0003, and 0.0001 ... For each gradient estimation technique, we used the Adam optimizer in PyTorch with ten random restarts and learning rates of 0.001, 0.003, 0.01, and 0.03. We recorded the estimated MLL every 10 iterations (i.e., parameter updates) with 50 Monte Carlo samples.
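The burn-in and thinning scheme quoted above (discard the first 250,000 of 10,000,000 draws, then keep every 1,000-th draw) reduces each chain to 9,750 retained samples. A minimal sketch, assuming draws are held in a Python sequence; `burn_and_thin` is a hypothetical helper name, and the chain below is a stand-in list rather than real BEAST output:

```python
def burn_and_thin(samples, burn_in=250_000, thin=1_000):
    """Discard the first `burn_in` draws, then keep every `thin`-th draw."""
    return samples[burn_in::thin]

chain = list(range(10_000_000))  # stand-in for one 10M-iteration MCMC chain
kept = burn_and_thin(chain)
print(len(kept))  # 9750 retained draws per chain
```

With 10 independent chains, this yields 97,500 posterior samples in total for the gold-standard approximation described in the quote.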