Learning Mixed Latent Tree Models

Authors: Can Zhou, Xiaofei Wang, Jianhua Guo

JMLR 2020

Reproducibility assessment: each variable below is listed with its result and the supporting LLM response.
Research Type: Experimental
LLM response: "Experiments on the simulated and real data support that our method is valid for mining the hierarchical structure and latent information." ... "In this section, we performed numerical experiments on both simulated and real data sets."

Researcher Affiliation: Academia
LLM response: "Can Zhou ... Xiaofei Wang ... Jianhua Guo ... KLAS and School of Mathematics and Statistics, Northeast Normal University, Changchun, China."

Pseudocode: Yes
LLM response: "Algorithm 1: Structural Learning for Latent Trees (SLLT). Input: observed variables V and information distances d_uv for any u, v ∈ V. Output: a tree structure T." ... "Algorithm 2: Parameter Estimation for Mixed Latent Trees (PEMT). Input: a latent tree with a root, the first-order moments E[X_u] for u ∈ V, the second-order moments E[X_u^2] for u ∈ V_c, and the moment matrices E_uvw and E_uv for u, v, w ∈ V. Output: all conditional probability matrices on edges in T."

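As a hedged illustration of the input Algorithm 1 consumes, the sketch below builds a pairwise information-distance matrix in R, taking d_uv = -log|cor(X_u, X_v)| for continuous variables. Treating this particular distance as SLLT's input is an assumption here, not something confirmed by the excerpt above.

    # Sketch only: one common information distance for continuous variables is
    # d_uv = -log |cor(X_u, X_v)|; assuming SLLT uses this form is ours, not the paper's.
    info_dist <- function(X) {
      R <- abs(cor(X))   # pairwise absolute correlations
      D <- -log(R)       # larger distance = weaker dependence
      diag(D) <- 0       # force exact zeros on the diagonal
      D
    }

    # Usage: 500 samples of 4 variables
    X <- matrix(rnorm(2000), ncol = 4)
    D <- info_dist(X)
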
Open Source Code: No
LLM response: The paper does not provide an explicit statement about releasing source code for the methodology, nor does it include a direct link to a code repository.

Open Datasets: Yes
LLM response: "In this part, we applied the SLLT algorithm to a Forest Cover Type dataset, which is available from the University of California, Irvine (UCI) machine learning data set repository." ... "For more details about ELU, please visit the following link: https://archive.ics.uci.edu/ml/datasets/Covertype. For more details about basic soil components, please visit the following link: https://casoilresource.lawr.ucdavis.edu/soil_web/ssurgo.php?action=list_mapunits&areasymbol=co645"

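For reference, a minimal sketch of pulling the Covertype data from the UCI repository in base R; the exact file path under the UCI archive is an assumption, and the paper does not describe its own loading code.

    # Hypothetical loading sketch; the covtype.data.gz path is assumed from the
    # UCI repository layout, not taken from the paper.
    url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/covtype/covtype.data.gz"
    tmp <- tempfile(fileext = ".gz")
    download.file(url, tmp, mode = "wb")
    covtype <- read.csv(gzfile(tmp), header = FALSE)
    dim(covtype)  # expected: 581012 rows, 55 columns (54 features + cover type)
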
Dataset Splits: No
LLM response: The paper describes generating synthetic data and using a real-world dataset (Forest Cover Type) but does not specify any training, testing, or validation splits for either. The experiments focus on structural-learning and parameter-estimation consistency rather than performance on predefined data splits.

Hardware Specification: Yes
LLM response: "All of the experiments were performed using R on a desktop with an Intel Core i5-3470 CPU (3.2 GHz) and 16 GB RAM."

Software Dependencies: No
LLM response: "All of the experiments were performed using R on a desktop with an Intel Core i5-3470 CPU (3.2 GHz) and 16 GB RAM." The paper mentions R but does not provide specific version numbers for R or for any libraries used.

Experiment Setup: Yes
LLM response: "In the structural learning simulation, we started the value of the threshold ε from 0.1 and let it increase with the step size 0.1 until the SLLT algorithm obtained a tree. We set τ1 = 3 and τ2 = 5." ... "We ran the EM with random initialization and 100 iterations."

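The threshold schedule above maps onto a simple loop; in the sketch below, slltree() and is_tree() are hypothetical stand-ins for the authors' SLLT routine and a tree-connectivity check, since no code is released.

    # Sketch of the reported schedule: start eps at 0.1 and grow it by 0.1
    # until SLLT returns a tree. slltree() and is_tree() are hypothetical helpers.
    eps <- 0.1
    repeat {
      T_hat <- slltree(D, eps = eps, tau1 = 3, tau2 = 5)  # D: information-distance matrix
      if (is_tree(T_hat)) break
      eps <- eps + 0.1
    }
    # The paper's parameter-estimation baseline, EM with random initialization
    # for 100 iterations, is not reproduced in this sketch.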