Transfer learning for tensor Gaussian graphical models
Authors: Mingyang Ren, Yaoming Zhen, Junhui Wang
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive numerical experiments are conducted on both synthetic tensor graphs and brain functional connectivity network data, which demonstrate the satisfactory performance of the proposed method. Numerical simulations and the applications to ADHD brain fMRI data and a breast cancer gene interaction study are conducted in Sections 5 and 6, respectively. |
| Researcher Affiliation | Academia | Mingyang Ren, School of Mathematical Sciences, Shanghai Jiao Tong University, Minhang, Shanghai, China; Yaoming Zhen, Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada; Junhui Wang, Department of Statistics, The Chinese University of Hong Kong, Shatin, Hong Kong, China |
| Pseudocode | Yes | The key to transfer learning is to construct a similarity measure between parameters of interest in the auxiliary and target domains... In view of the above discussion, for each mode, a multi-step method can be proposed to realize the transfer learning of tensor graphical models. Step 1. Initialization... Step 2. For each $m \in [M]$, perform the following two estimation steps separately. (a) Estimate the divergence matrix of mode-$m$... (b) Estimate the precision matrix of mode-$m$... For Step 2(a): $[\hat{\Delta}_m]_{(i,j)} = \mathrm{sign}([\hat{B}_m]_{(i,j)}) \max(0, |[\hat{B}_m]_{(i,j)}| - \lambda_1)$... For Step 2(b), at iteration $t+1$, the updating formula of $\theta_i$, the $i$-th component of $\theta$, with the other components $\{\theta^{(t+1)}_{i'}, i' < i;\ \theta^{(t)}_{i'}, i' > i\}$ fixed, is $\theta^{(t+1)}_i = [\hat{\Sigma}^{A}_m]^{-1}_{(i,i)}\, T(\xi^{(t)}, \lambda_{2m} I(i = j))$, for $i = 1, \dots, p_m$, where $\xi^{(t)} = [\hat{\Delta}_m + I_{p_m}]_{(i,j)} - \sum_{i' < i} \theta^{(t+1)}_{i'} [\hat{\Sigma}^{A}_m]_{(i,i')} - \sum_{i' > i} \theta^{(t)}_{i'} [\hat{\Sigma}^{A}_m]_{(i,i')}$. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the proposed methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | The analyzed dataset is part of the ADHD-200 repository (Bellec et al., 2017), which is collected from seven sites... the processed data is publicly available at https://www.nitrc.org/plugins/mwiki/index.php/neurobureau:AthenaPipeline#Whole_Brain_Data. Readers may refer to Bellec et al. (2017) for more details about the raw brain imaging data. In this example, we consider the breast cancer gene expression data, which can be downloaded using the R package brca.data (https://github.com/averissimo/brca.data/releases/download/1.0/brca.data_1.0.tar.gz). |
| Dataset Splits | Yes | To this end, we can first randomly split the data from the target domain into two folds $N$ and $N^C$, such that $N \cup N^C = \{1, \dots, n\}$ and $\mathrm{card}(N) = cn$, for some fraction $0 < c < 1$. As suggested in Li et al. (2022b), the value of $c$ might not be sensitive in practice, and we thus set $c = 0.6$ in all of our numerical experiments. In addition to the proposed methods and Tlasso, we also consider a naive baseline by applying Trans-CLIME (Li et al., 2022b) after flattening the tensor into a vector. Note that the underlying true parameters of the precision matrices are unavailable, so we use the negative log-likelihood based on five-fold cross-validation as an indicator to evaluate the performance of all competitors when a site is fixed as the target domain. Specifically, samples of the target domain are randomly divided into five parts, one of which is used as the test sample to calculate the covariance matrices of all modes $\{\hat{\Sigma}^{\text{test}}_m\}_{m=1}^{M}$ and the rest is the training sample. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The R package Tlasso is mentioned, and the R package brca.data is mentioned, but no specific version numbers are provided for these or any other software dependencies. |
| Experiment Setup | Yes | For each target graph, we set $M = 3$ with dimensions $(p_1, p_2, p_3) = (10, 10, 20)$ or $M = 2$ with dimensions $(p_1, p_2) = (100, 100)$, and set the sample size of the target domain as $n = 50$... As for the tuning parameter selection, we set $\lambda_{1m} = 2\|\hat{\Omega}^{(0)}_m\|_1 \sqrt{\log p_m/(np)}$ for mode-$m$, following Li et al. (2022b). For $\lambda_{2m}$, it is suggested to be determined via minimizing a BIC-type criterion... We set $c = 0.6$ in all of our numerical experiments... we use the negative log-likelihood based on five-fold cross-validation as an indicator to evaluate the performance... |
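The Step 2(a) update quoted in the Pseudocode row is an entrywise soft-thresholding of an intermediate estimate of the divergence matrix. A minimal NumPy sketch of that operation follows; the matrix `B_m` and threshold `lam1` are illustrative values, not taken from the paper:

```python
import numpy as np

def soft_threshold(B, lam):
    """Entrywise soft-thresholding: sign(B) * max(0, |B| - lam)."""
    return np.sign(B) * np.maximum(0.0, np.abs(B) - lam)

# Illustrative intermediate estimate for one mode and an illustrative threshold.
B_m = np.array([[0.5, -0.05],
                [0.2, -0.8]])
lam1 = 0.1

# Sparse divergence-matrix estimate: small entries are zeroed out,
# larger entries are shrunk toward zero by lam1.
Delta_hat = soft_threshold(B_m, lam1)
```

Entries with magnitude below the threshold (here the -0.05) are set exactly to zero, which is what produces the sparse divergence estimate the method relies on.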
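The random split described in the Dataset Splits row ($N \cup N^C = \{1, \dots, n\}$ with $\mathrm{card}(N) = cn$ and $c = 0.6$) can be sketched as below; the seed is arbitrary, and $n = 50$ matches the target-domain sample size from the Experiment Setup row:

```python
import numpy as np

n, c = 50, 0.6                       # target-domain sample size and split fraction c = 0.6
rng = np.random.default_rng(0)       # arbitrary seed, for reproducibility only

perm = rng.permutation(n)            # random ordering of the n target-domain indices
N = np.sort(perm[: int(c * n)])      # fold N, with card(N) = c * n = 30
N_comp = np.sort(perm[int(c * n):])  # complement fold N^C with the remaining 20 samples
```

The two folds are disjoint and together cover all $n$ indices, as the definition $N \cup N^C = \{1, \dots, n\}$ requires.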