Ensemble Learning for Relational Data

Authors: Hoda Eldardiry, Jennifer Neville, Ryan A. Rossi

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate the ensemble method on both synthetic and real world data. Furthermore, we demonstrate the effectiveness of the proposed methods for two different network settings: (i) the single graph setting (Section 5.1) and (ii) the multi-graph setting where there are multiple graphs available with different link types (Section 5.2)."
Researcher Affiliation | Collaboration | "Hoda Eldardiry EMAIL Department of Computer Science Virginia Tech... Jennifer Neville EMAIL Department of Computer Science and Department of Statistics Purdue University... Ryan A. Rossi EMAIL Adobe Research..."
Pseudocode | Yes | "Algorithm 1 Ensemble Learning: EL(Gtr = (Vtr, Etr), m)... Algorithm 2 Relational Subgraph Resampling: RSR(G = (V, E), b)... Algorithm 3 Collective Ensemble Classification (CEC)"
Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets | Yes | "Synthetic data sets are generated with a latent group model Neville and Jensen (2005)... The Facebook dataset used in this work is a sample of Purdue University Facebook network... The second data set is from IMDb (Internet Movie Database)..."
Dataset Splits | Yes | "The 10 trials are repeated for 4 training and test pairs... The robustness of the methods to missing labels (in the test set) is evaluated by varying the proportion of labeled test data at 10% through 90%."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or specific computer configurations) used for running its experiments.
Software Dependencies | No | "For the experiments in Section 5, we use relational dependency network (RDN) Neville and Jensen (2007) models as the component collective classification models..." No specific software versions are mentioned for RDNs or any other libraries.
Experiment Setup | Yes | "The RSR algorithm uses a subgraph size b = 50 and b = 10 for the synthetic and Facebook experiment, respectively... using 450–500 Gibbs iterations for collective inference... we construct 5 bootstrap pseudosamples and learn the ensemble models (i.e., m = 5)."
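Algorithm 2 (RSR) is only named in the extract above, so its exact procedure is not recoverable here. The following is a minimal sketch of one plausible subgraph-resampling scheme, assuming subgraphs of size b are grown by breadth-first expansion from a random seed node; the function name, adjacency-dict representation, and toy graph are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import deque

def relational_subgraph_resample(adj, b, rng=random):
    """Sample one connected subgraph of up to b nodes by breadth-first
    expansion from a random seed. `adj` maps node -> list of neighbors."""
    seed = rng.choice(list(adj))
    visited = {seed}
    queue = deque([seed])
    while queue and len(visited) < b:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in visited:
                visited.add(nbr)
                queue.append(nbr)
                if len(visited) >= b:
                    break
    return visited

# Toy graph: a path of 6 nodes.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
sub = relational_subgraph_resample(adj, b=3)
print(len(sub))  # prints 3
```

Repeated calls yield overlapping but distinct subgraphs, which is what gives an ensemble built on them its diversity.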
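The quoted experiment setup (m = 5 bootstrap pseudosamples, one component model learned per sample, predictions combined across components) can be sketched generically. The paper's components are RDN collective-classification models, which cannot be reproduced from this extract, so the midpoint-threshold classifier below is a hypothetical stand-in used only to show the bootstrap-and-average ensemble structure.

```python
import random

def fit_threshold(sample):
    """Toy component classifier: threshold at the midpoint of class means."""
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def bootstrap(sample, rng):
    """Resample with replacement within each class so both labels survive."""
    out = []
    for label in (0, 1):
        cls = [ex for ex in sample if ex[1] == label]
        out += [rng.choice(cls) for _ in cls]
    return out

def ensemble_predict(x, thresholds):
    """Average the component votes and threshold at 0.5."""
    votes = [1 if x > t else 0 for t in thresholds]
    return 1 if sum(votes) / len(votes) >= 0.5 else 0

rng = random.Random(0)
data = [(i / 10, 0) for i in range(10)] + [(2 + i / 10, 1) for i in range(10)]
m = 5  # five bootstrap pseudosamples, as in the quoted setup
thresholds = [fit_threshold(bootstrap(data, rng)) for _ in range(m)]
print(ensemble_predict(0.5, thresholds), ensemble_predict(2.5, thresholds))  # prints: 0 1
```

In the paper's setting the per-component prediction step would itself be collective inference over a graph (hence the 450–500 Gibbs iterations quoted above) rather than a single thresholding.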
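The dataset-split protocol, which varies the proportion of labeled test data from 10% through 90%, can be sketched as a simple masking sweep. The helper name and integer node encoding are assumptions; only the 10%-90% sweep itself comes from the quoted text.

```python
import random

def sample_labeled_test_nodes(test_nodes, labeled_frac, rng):
    """Return the subset of test nodes whose labels stay visible during
    inference; the remaining nodes are the ones to be predicted."""
    k = round(labeled_frac * len(test_nodes))
    return set(rng.sample(test_nodes, k))

rng = random.Random(0)
test_nodes = list(range(100))

# Sweep the labeled proportion from 10% to 90%, as in the quoted protocol.
sizes = []
for pct in range(10, 100, 10):
    labeled = sample_labeled_test_nodes(test_nodes, pct / 100, rng)
    sizes.append(len(labeled))
print(sizes)  # prints [10, 20, 30, 40, 50, 60, 70, 80, 90]
```

In the full protocol each point of this sweep would be repeated over the 10 trials and 4 train/test pairs mentioned in the Dataset Splits row, with accuracy averaged across runs.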