reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Mode Estimation for High Dimensional Discrete Tree Graphical Models

Authors: Chao Chen, Han Liu, Dimitris Metaxas, Tianqi Zhao

NeurIPS 2014 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results demonstrate the accuracy and efficiency of our algorithm. More theoretical guarantee of our algorithm can be found in [7]. To validate our method, we first show the scalability and accuracy of our algorithm in synthetic data. Furthermore, we demonstrate using biological data how modes can be used as a novel analysis tool.
Researcher Affiliation	Academia	Chao Chen, Department of Computer Science, Rutgers, The State University of New Jersey, Piscataway, NJ 08854-8019, EMAIL; Han Liu, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, EMAIL; Dimitris N. Metaxas, Department of Computer Science, Rutgers, The State University of New Jersey, Piscataway, NJ 08854-8019, EMAIL; Tianqi Zhao, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, EMAIL
Pseudocode	Yes	Procedure 1 Compute-M-Modes. Input: A tree G, a potential function f and a scale δ. Output: The M modes of the lowest potential. 1: Construct geodesic balls 𝓑 = {B_r(c) | c ∈ V}, where r = ⌊δ/2⌋ + 1. 2: for all B ∈ 𝓑 do 3: M^δ_B = the set of local modes of B. 4: Construct a junction tree (Figure 2). The label set of each supernode is its local modes. 5: Compute the M lowest-potential labelings of the junction tree, using Nilsson's algorithm.
Open Source Code	No	The paper does not provide any link to source code nor states its public availability for the described methodology.
Open Datasets	Yes	Biological data analysis. We compute modes of the microarray data of Arabidopsis thaliana plant (108 samples, 39 dimensions) [24].
Dataset Splits	No	The paper mentions generating synthetic data and using different sample sizes (10K, 40K, 80K) for evaluation, and also uses a biological dataset, but it does not specify explicit training, validation, or test dataset splits.
Hardware Specification	No	The paper discusses running time and scalability, but does not provide any specific hardware details such as GPU or CPU models used for the experiments.
Software Dependencies	No	The paper describes algorithms and methods but does not list any specific software dependencies with version numbers.
Experiment Setup	Yes	In all experiments, we choose M to be 500. We randomly generate tree-structured graphical model (tree size D = 200 ... 2000, label size L = 3) and test the speed. We randomly generate tree-structured distributions (D = 20, L = 2). To evaluate the sensitivity of our method to noise, we randomly flip 0%, 5%, 10%, 15% and 20% labels of these samples.
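
The synthetic setup quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code, which the paper does not release. The helper names random_tree and flip_labels are hypothetical, the tree is built by attaching each node to a uniformly chosen earlier node, the sample is drawn uniformly as a placeholder rather than from a tree-structured distribution, and flipping is applied independently per label with the given probability. All of these are assumptions, since the quoted text does not specify them.

```python
import random


def random_tree(num_nodes, rng):
    """Return the edges of a random tree on nodes 0..num_nodes-1.

    Assumption: each node i > 0 attaches to a uniformly chosen earlier node.
    """
    return [(rng.randrange(i), i) for i in range(1, num_nodes)]


def flip_labels(sample, flip_rate, num_labels, rng):
    """Replace each label, with probability flip_rate, by a different label.

    Assumption: flips are independent per label; the replacement is drawn
    uniformly from the other labels.
    """
    noisy = []
    for label in sample:
        if rng.random() < flip_rate:
            others = [l for l in range(num_labels) if l != label]
            noisy.append(rng.choice(others))
        else:
            noisy.append(label)
    return noisy


rng = random.Random(0)  # fixed seed so the sketch is repeatable
D, L = 20, 2            # tree size and label size from the noise experiment
edges = random_tree(D, rng)

# Placeholder sample: uniform labels, NOT a draw from a tree distribution.
sample = [rng.randrange(L) for _ in range(D)]

for rate in (0.0, 0.05, 0.10, 0.15, 0.20):
    noisy = flip_labels(sample, rate, L, rng)
    changed = sum(a != b for a, b in zip(sample, noisy))
    print(f"flip rate {rate:.2f}: {changed} of {D} labels changed")
```

Fixing the seed addresses one gap the table flags: without recorded seeds, splits, or software versions, a rerun of the synthetic experiments cannot be expected to match the reported numbers exactly.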