Predictive Learning on Hidden Tree-Structured Ising Models

Authors: Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Anand D. Sarwate

JMLR 2021

Reproducibility Variable Result LLM Response
Research Type Experimental We provide empirical results based on synthetic data to illustrate the probability of error δ as a function of the cross-over probability q and the number of samples n. For the simulations of this paper the original tree structure T is generated randomly: starting from the root, we choose the parent of each new node uniformly at random among the nodes currently in the tree, in a sequential fashion. First, we estimate the probability of error P(T_CL ≠ T) (denoted δ) for the structure learning problem (Figure 3). For the structure learning experiments, the number of nodes is 100, β = arctanh(0.8), and α = arctanh(0.2). Further, we consider 100 Monte Carlo runs for averaging, and we plot the estimated probability of incorrect structure recovery as q and n vary. As a next step, we would like to see how well the theoretical bound of Theorem 5 matches the experimental results. To do this we plot the top view of Figure 3 to obtain Figure 1. Quite remarkably, the theoretical and experimental bounds match exactly, which suggests that our derived bound on the sample complexity of the Chow-Liu algorithm (Theorem 5) is indeed accurate.
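The data-generation procedure quoted above can be sketched in a few lines. This is our own illustration, not code from the paper: the paper specifies only the tree-growth rule and the parameters, while the function names and the ancestral-sampling details below are assumptions (for a zero-field tree Ising model with edge weight θ, a child agrees with its parent with probability (1 + tanh θ)/2, and each observed spin is flipped independently with cross-over probability q).

```python
import math
import random

def random_recursive_tree(p, seed=0):
    """Grow a tree as described in the paper: starting from the root,
    each new node picks its parent uniformly at random among the
    nodes already in the tree."""
    rng = random.Random(seed)
    parent = {0: None}
    for v in range(1, p):
        parent[v] = rng.randrange(v)  # uniform over nodes 0..v-1 already in the tree
    return parent

def sample_hidden_ising(parent, theta, q, rng):
    """Ancestral sampling of a zero-field tree Ising model, followed by
    a binary symmetric channel with cross-over probability q.
    For edge weight theta, P(x_child = x_parent) = (1 + tanh(theta)) / 2."""
    p_agree = (1.0 + math.tanh(theta)) / 2.0
    p_nodes = len(parent)
    x = [0] * p_nodes
    x[0] = rng.choice((-1, 1))
    for v in range(1, p_nodes):          # parent[v] < v, so parents are set first
        x[v] = x[parent[v]] if rng.random() < p_agree else -x[parent[v]]
    # noisy observation: each spin flips independently with probability q
    y = [xi if rng.random() >= q else -xi for xi in x]
    return x, y
```

With β = arctanh(0.8) as in the experiments, a child agrees with its parent with probability (1 + 0.8)/2 = 0.9.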
Researcher Affiliation Academia Konstantinos E. Nikolakakis, Department of Electrical & Computer Engineering, Rutgers, The State University of New Jersey, 94 Brett Road, Piscataway, NJ 08854, USA; Dionysios S. Kalogerias, Department of Electrical & Computer Engineering, Michigan State University, 428 S. Shaw Lane, MI 48824, USA; Anand D. Sarwate, Department of Electrical & Computer Engineering, Rutgers, The State University of New Jersey, 94 Brett Road, Piscataway, NJ 08854, USA
Pseudocode Yes Algorithm 1 (Chow-Liu)
Require: D = {y^(1), y^(2), ..., y^(n)} ⊆ {−1, +1}^p, where y^(k) is the kth observation of Y
1: Compute μ̂_{i,j} ← (1/n) Σ_{k=1}^{n} y_i^(k) y_j^(k), for all i, j ∈ V
2: return T_CL ← MaximumSpanningTree({ |μ̂_{i,j}| : i ≠ j })
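The pseudocode above can be rendered as a short runnable sketch. This is our own rendering of Algorithm 1, with Kruskal's algorithm (plus a small union-find) standing in for the unspecified maximum-spanning-tree routine:

```python
from itertools import combinations

def chow_liu_tree(samples):
    """Chow-Liu for zero-field binary trees (Algorithm 1): compute the
    empirical correlations mu_hat[i][j] and return a maximum-weight
    spanning tree over the weights |mu_hat[i][j]|.
    `samples` is a list of n observations, each a list of p values in {-1, +1}."""
    n = len(samples)
    p = len(samples[0])
    # Step 1: empirical pairwise correlations mu_hat[i][j] = (1/n) sum_k y_i y_j
    mu = [[0.0] * p for _ in range(p)]
    for i, j in combinations(range(p), 2):
        mu[i][j] = sum(y[i] * y[j] for y in samples) / n
    # Step 2: maximum spanning tree via Kruskal with union-find
    edges = sorted(combinations(range(p), 2),
                   key=lambda e: abs(mu[e[0]][e[1]]), reverse=True)
    root = list(range(p))
    def find(a):
        while root[a] != a:
            root[a] = root[root[a]]  # path compression
            a = root[a]
        return a
    tree = set()
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                 # adding (i, j) creates no cycle
            root[ri] = rj
            tree.add((i, j))
            if len(tree) == p - 1:
                break
    return tree
```

For example, on three variables whose empirical correlations are μ̂_{0,1} = 0.8, μ̂_{1,2} = 0.6, μ̂_{0,2} = 0.4, the routine recovers the chain {(0, 1), (1, 2)}.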
Open Source Code Yes Figure 1: The simulation corresponds to structure learning. Comparison of the experimental results (heat-map) and the theoretical bound of Theorem 5, the bound that yields (1). The colored regions denote different values of the estimated probability of error δ (at least one edge has been missed). The value of δ varies between 0 and 1 while the parameters α = 0.2, β = 1.1, p = 100 are fixed. The red line shows the bound from Theorem 5 (the explicit form of Theorem 1). The code of the experiment is available at https://github.com/KonstantinosNikolakakis/Structure-Learning.
Open Datasets No We provide empirical results based on synthetic data to illustrate the probability of error δ as function of the cross-over probability q and the number of samples n. For the simulations of this paper the original tree structure T is generated randomly where, starting from the root, we choose the parent of each new node uniformly at random among the nodes that are currently in the tree, in a sequential fashion.
Dataset Splits No The paper uses synthetic data generated for its simulations. It specifies parameters for generating this data (e.g., number of nodes, alpha, beta values, Monte Carlo runs) rather than using predefined training/test/validation splits from a fixed dataset.
Hardware Specification No The paper describes simulations and experimental results but does not specify any particular hardware used for running these experiments, such as CPU or GPU models, or cloud computing resources.
Software Dependencies No The paper does not explicitly mention any specific software dependencies with version numbers, such as programming languages, libraries, or frameworks used for implementation or analysis.
Experiment Setup Yes For the structure learning experiments, the number of nodes is 100, β = arctanh(0.8), and α = arctanh(0.2). Further, we consider 100 Monte Carlo runs for averaging, and we plot the estimated probability of incorrect structure recovery as q and n vary. Second, we plot the probability of error for the predictive learning task, that is, the probability that the SSTV exceeds a positive number η (Figure 4). For the simulation part, we restrict our attention to the case where the first of the three terms in the maximization of (41) is dominant. In fact, η = 0.03, p = 31, while α and β are the same as in the structure learning experiments. Finally, Figure 5 presents the SSTV itself for different values of q and n.
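The dependence of these error curves on the cross-over probability q traces back to a standard fact about the binary symmetric channel: flipping each of two ±1 variables independently with probability q attenuates their correlation by (1 − 2q)². The Monte Carlo check below is our own illustration of this fact, not code from the paper:

```python
import random

def attenuated_correlation(mu_x, q, n, seed=0):
    """Monte Carlo estimate of E[Y_i Y_j] when (X_i, X_j) are ±1 variables
    with correlation mu_x and each is flipped independently w.p. q.
    Theory: E[Y_i Y_j] = (1 - 2q)^2 * mu_x."""
    rng = random.Random(seed)
    p_agree = (1.0 + mu_x) / 2.0   # P(x_i = x_j) for correlation mu_x
    total = 0
    for _ in range(n):
        xi = rng.choice((-1, 1))
        xj = xi if rng.random() < p_agree else -xi
        yi = xi if rng.random() >= q else -xi   # BSC with cross-over q
        yj = xj if rng.random() >= q else -xj
        total += yi * yj
    return total / n

# With q = 0.1 and mu_x = 0.8, E[Y_i Y_j] = (1 - 0.2)^2 * 0.8 = 0.512
q, mu_x = 0.1, 0.8
est = attenuated_correlation(mu_x, q, n=200_000)
expected = (1 - 2 * q) ** 2 * mu_x
```

This attenuation is why larger q pushes the empirical correlations closer together and drives up the number of samples n needed for correct recovery in the heat-maps.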