Meta Learning for Support Recovery of High-Dimensional Ising Models

Authors: Huiming Xie, Jean Honorio

TMLR 2024

Reproducibility Variable — Result — LLM Response
Research Type: Experimental (5 experiments). Synthetic Experiments. To help illustrate and validate our theories, we conduct a group of synthetic experiments and report the success rates (over 100 repetitions) for recovery of the true support union. We run the experiments with different numbers of nodes p ∈ {50, 100, 200} and degree d = 3, as well as with p = 50 nodes and three different degrees d ∈ {3, 5, 7}. We set the number of tasks scaling as K = d³ log p, with sample size for each auxiliary task n = C d³ log p / K for C ranging from 1 up to 200. Then, based on the support union estimated with C = 200, we use different sample sizes n^(K+1) = C′ d³ log d for the novel task as C′ varies from 1 to 200, and calculate the success rates (over 100 repetitions) for signed edge recovery of the novel task. We plot the success rates against C and C′ for the two steps in Figure 1 and Figure 2, respectively. The curves approximately lie on top of one another as the success rates tend to 1 in each step, as predicted by Theorems 4.7 and 4.9. Our results compare favorably against alternative methods. See Appendix B.1 for more details. Real-world Data Experiments. As another motivation and validation of our method, we used the real-world dataset 1000 Functional Connectomes at http://www.nitrc.org/projects/fcon_1000/ from 1128 subjects, 41 sites worldwide, and p = 157 brain regions.
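The sample-size scalings quoted above are simple closed-form expressions, so they can be sketched numerically. A minimal illustration, assuming the stated scalings K = d³ log p, n = C d³ log p / K, and n^(K+1) = C′ d³ log d (the function name and the rounding to integer counts are assumptions for illustration, not part of the paper):

```python
import math

def meta_learning_sample_sizes(p, d, C, C_prime):
    """Task count and per-task sample sizes under the paper's quoted scalings.

    p        -- number of nodes in the Ising model
    d        -- maximum node degree
    C, C_prime -- scaling constants for the auxiliary and novel tasks
    """
    K = round(d**3 * math.log(p))                  # number of auxiliary tasks
    n = round(C * d**3 * math.log(p) / K)          # samples per auxiliary task
    n_novel = round(C_prime * d**3 * math.log(d))  # samples for the novel task
    return K, n, n_novel

# Example: p = 50 nodes, degree d = 3, C = C' = 200 (the largest setting tested)
K, n, n_novel = meta_learning_sample_sizes(50, 3, 200, 200)
```

Note how the novel-task requirement scales with log d rather than log p, which is the meta-learning gain the theorems quantify.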
Researcher Affiliation: Academia. Huiming Xie (EMAIL), Department of Statistics, Purdue University; Jean Honorio (EMAIL), School of Computing and Information Systems, The University of Melbourne.
Pseudocode: No. The paper describes its methods verbally and mathematically but does not include any explicitly labeled pseudocode or algorithm blocks. For example, "Our estimation procedure can be divided into two steps." describes the process in paragraph form.
Open Source Code: No. The paper contains no explicit statement about releasing source code for the described methodology, nor does it link to any code repository. It mentions OpenReview for the paper itself, not for code.
Open Datasets: Yes. "Real-world Data Experiments. As another motivation and validation of our method, we used the real-world dataset 1000 Functional Connectomes at http://www.nitrc.org/projects/fcon_1000/ from 1128 subjects, 41 sites worldwide, and p = 157 brain regions."
Dataset Splits: No. The paper describes how tasks are used (K auxiliary tasks and 1 novel task) and how sample sizes per task are set, but it does not specify traditional train/validation/test splits of samples within any dataset. For example, "We estimated the support union from K = 40 auxiliary tasks... We used task 41 as the novel task" describes a task-level split, not a data-sample split.
Hardware Specification: No. The paper provides no details about the hardware (e.g., CPU or GPU models, memory, cloud instances) used to run the experiments.
Software Dependencies: No. The paper does not name any specific software libraries, frameworks, or version numbers used for the implementation or experiments.
Experiment Setup: Yes. "We set the number of tasks scaling as K = d³ log p, with sample size for each auxiliary task n = C d³ log p / K for C ranging from 1 up to 200. Then, based on the support union estimated with C = 200, we use different sample sizes n^(K+1) = C′ d³ log d for the novel task as C′ varies from 1 to 200, and calculate the success rates (over 100 repetitions) for signed edge recovery of the novel task."
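The reported success rates are the fraction of exact recoveries over 100 independent repetitions, which reduces to a plain Monte-Carlo loop. A generic sketch of that protocol, where `trial` is a hypothetical stand-in for one full recovery attempt (the paper's actual estimation procedure is not reproduced here):

```python
import random

def success_rate(trial, repetitions=100, seed=0):
    """Fraction of repetitions in which `trial` reports exact recovery.

    `trial` is any callable taking an RNG and returning True iff the
    support (or signed edge set) was recovered exactly in that run.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible estimate
    return sum(trial(rng) for _ in range(repetitions)) / repetitions

# Toy stand-in: a trial that "succeeds" with probability 0.9
rate = success_rate(lambda rng: rng.random() < 0.9)
```

Plotting this rate against the scaling constant (C or C′) for each (p, d) setting yields the curves of Figures 1 and 2; the theory predicts they collapse onto one another as the rate approaches 1.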