reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning-Augmented Hierarchical Clustering

Authors: Vladimir Braverman, Jon C. Ergun, Chen Wang, Samson Zhou

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Our main results are algorithms that given learning-augmented oracles, compute efficient approximate HC trees for the celebrated Dasgupta s and Moseley-Wang objectives that overcome known hardness barriers. ... This paper presents work that advances the Machine Learning and Algorithm Design fields. Due to its theoretical nature, we do not see any immediate societal impacts
Researcher Affiliation	Collaboration	1Johns Hopkins University 2Rice University 3Google Research 4Carnegie Mellon University 5Texas A&M University. Correspondence to: Vladimir Braverman <EMAIL>, Jon C. Ergun <EMAIL>, Chen Wang <EMAIL>, Samson Zhou <EMAIL>.
Pseudocode	Yes	Algorithm 1 A polynomial-time algorithm for the Dasgupta s HC objective ... Algorithm 2 An O(n3) time algorithm for the Dasgupta s HC objective ... Algorithm 3 A near-linear time algorithm for the Moseley Wang HC objective ... Algorithm 4 A polynomial-time single-pass semi-streaming algorithm for the Dasgupta s HC objective ... Algorithm 5 A near-linear work, poly-logarithmic depth PRAM algorithm for the Moseley-Wang HC objective ... Algorithm 6 counterpart-tester-strong ... Algorithm 7 strong-partial-tree O(e V ) ... Algorithm 8 weak-partial-tree O(e V , VH) ... Algorithm 9 counterpart-tester(O,VH) ... Algorithm 10 predecessor-tester(O,VH) ... Algorithm 11 test-orphan-predecessor(O,VH,e V ) ... Algorithm 12 vertex-split O(e V , VH) ... Algorithm 13 tree-merge O(TT , Te V \T , u ).
Open Source Code	No	The paper does not explicitly state that source code for the described methodology is publicly available, nor does it provide any links to a code repository.
Open Datasets	No	The paper discusses hierarchical clustering with 'n data points' and refers to 'datasets' in a general, theoretical context (e.g., 'datasets often have additional auxiliary information'). However, it does not describe using any specific, named datasets for experiments or provide concrete access information for any dataset.
Dataset Splits	No	The paper is theoretical in nature and does not conduct empirical studies using specific datasets. Therefore, no information on training/test/validation dataset splits is provided.
Hardware Specification	No	The paper is theoretical and focuses on algorithm design and approximation guarantees. It does not describe any experimental hardware used.
Software Dependencies	No	The paper is theoretical and focuses on algorithm design. It does not mention any specific software dependencies or versions used for implementation or experimentation.
Experiment Setup	No	The paper is theoretical and presents algorithms and their approximation guarantees. It does not include an experimental setup with hyperparameters or system-level training settings.