Learning Network Granger causality using Graph Prior Knowledge

Authors: Lucas Zoroddu, Pierre Humbert, Laurent Oudre

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Furthermore, we present experimental results derived from both synthetic and real-world datasets. These results clearly illustrate the advantages of our proposed method over existing alternatives, particularly in situations where few samples are available."
Researcher Affiliation | Academia | "Lucas Zoroddu (EMAIL) Université Paris-Saclay, Université Paris Cité, ENS Paris-Saclay, CNRS, SSA, INSERM, Centre Borelli, F-91190, Gif-sur-Yvette, France; Pierre Humbert (EMAIL) Université Paris-Saclay, CNRS, Inria, Laboratoire de mathématiques d'Orsay, F-91405, Orsay, France; Laurent Oudre (EMAIL) Université Paris-Saclay, Université Paris Cité, ENS Paris-Saclay, CNRS, SSA, INSERM, Centre Borelli, F-91190, Gif-sur-Yvette, France"
Pseudocode | Yes | "The final AALasso algorithm is presented in Algorithm 1."
Open Source Code | No | "The code to reproduce our experiments with synthetic data will be available."
Open Datasets | No | "Synthetic data are generated with respect to the statistical model (15). To define the matrix A, we first generate p = 40 points in [0, 1]² uniformly at random. Then, we construct a matrix D ∈ ℝ^(p×p) by applying the Gaussian kernel to the pairwise Euclidean distances between the points (the standard deviation of the Gaussian kernel is taken equal to the median of all pairwise distances). A is obtained by randomly setting to 0 a ratio τ_m = 0.5 of the values of D (misspecified edges) and cutting to 0 values smaller than τ = 0.7 to promote sparsity. Finally, the VAR parameters are drawn from Laplace(0, A_ij) and A_prior = D + E, where E is a symmetric matrix whose subdiagonal values are sampled from an i.i.d. Gaussian distribution with variance σ²_A (varying in {0.02, 0.1, 0.25, 0.35} to test several noise levels). Note that A_prior is computed from D and not from A. Indeed, A is obtained by performing sparse variable selection from D; therefore, we do not incorporate this variable selection into A_prior, and the objective is to assess the effectiveness of AALasso in accurately retrieving this selection. This data generation is summarized in Algorithm 2. At the end, for each experiment, we sample N ∈ {80, 200, ..., 500} time series X[t] ∼ N(C X[t−1], σ²_X), with σ²_X = 0.1, which we split into training and test sets of equal sizes. We repeat this procedure 20 times for each value of N. The Heritage Provider Network DREAM 8 Breast Cancer Network Prediction dataset focuses on predicting causal protein networks using time-series data from reverse-phase protein array (RPPA) experiments. The Molène dataset contains hourly temperatures recorded by sensors at p = 32 locations in Brittany (France) during N = 746 hours."
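The quoted synthetic-data generation can be sketched in numpy. This is an illustrative reconstruction of the described steps (Gaussian kernel on pairwise distances, random masking, thresholding, Laplace-distributed VAR coefficients, noisy symmetric prior), not the authors' Algorithm 2; variable names and the random seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 40

# p points drawn uniformly in [0, 1]^2 and their pairwise Euclidean distances
pts = rng.random((p, 2))
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

# Gaussian kernel with bandwidth set to the median off-diagonal pairwise distance
sigma = np.median(dist[np.triu_indices(p, 1)])
D = np.exp(-dist**2 / (2 * sigma**2))

# misspecify: zero a ratio tau_m = 0.5 of entries, then cut values < tau = 0.7
A = D.copy()
A[rng.random((p, p)) < 0.5] = 0.0
A[A < 0.7] = 0.0

# VAR coefficients ~ Laplace(0, A_ij): scale a unit Laplace draw by A entrywise,
# so zero entries of A give exactly zero coefficients
C = rng.laplace(0.0, 1.0, (p, p)) * A

# noisy prior A_prior = D + E, with E symmetric Gaussian (sigma_A^2 = 0.1 shown;
# the paper varies it in {0.02, 0.1, 0.25, 0.35})
G = np.triu(rng.normal(0.0, np.sqrt(0.1), (p, p)), 1)
E = G + G.T
A_prior = D + E
```

In practice, C may also need rescaling so the VAR(1) process X[t] ∼ N(C X[t−1], σ²_X) is stable; the quoted text does not detail this step.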
Dataset Splits | Yes | "At the end, for each experiment, we sample N ∈ {80, 200, ..., 500} time series X[t] ∼ N(C X[t−1], σ²_X), with σ²_X = 0.1, which we split into training and test sets of equal sizes. We repeat this procedure 20 times for each value of N. For all of the 20 experiments, we performed N_iter = 10 (see 4.4.4) iterations of the alternating minimization algorithm (1) using half of the training set. The parameters λ and γ were selected by cross-validation, minimizing the nRMSE over the second half of the training set (the validation set)."
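The described split-and-select procedure (fit on one half of the training set, pick λ and γ minimizing an error on the other half) can be sketched generically. The `fit` and `predict` callables here stand in for the actual AALasso routines, which the report does not reproduce; they are placeholders.

```python
import numpy as np
from itertools import product

def select_lambda_gamma(train, fit, predict, lambdas, gammas):
    """Grid-select (lam, gam): fit on the first half of `train`,
    keep the pair minimizing RMSE on the second half (validation)."""
    half = len(train) // 2
    fit_part, val_part = train[:half], train[half:]
    best, best_rmse = None, np.inf
    for lam, gam in product(lambdas, gammas):
        model = fit(fit_part, lam, gam)
        rmse = np.sqrt(np.mean((predict(model, val_part) - val_part) ** 2))
        if rmse < best_rmse:
            best, best_rmse = (lam, gam), rmse
    return best
```

The paper's cross-validation may differ in detail (e.g. multiple folds); this shows only the half/half selection explicitly described in the quote.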
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | "For all experiments, we used the package asgl (Mendez-Civieta et al., 2021), implemented using cvxpy (Diamond & Boyd, 2016), to solve Lasso and Adaptive Lasso regression problems."
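The Adaptive Lasso objective that asgl/cvxpy solve, a weighted-ℓ1 penalized least squares, can be sketched with a plain proximal-gradient (ISTA) loop in numpy. This is an illustrative solver for the same objective, not the asgl or cvxpy API; the step size and iteration count are assumptions.

```python
import numpy as np

def adaptive_lasso_ista(X, y, w, lam, n_iter=500):
    """Minimize (1/2n)||y - Xb||^2 + lam * sum_j w_j |b_j| by ISTA:
    gradient step on the quadratic part, then coordinate-wise
    soft-thresholding with per-coordinate thresholds lam * w_j."""
    n, d = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the gradient
    b = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        z = b - grad / L
        b = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)
    return b
```

With uniform weights `w`, this reduces to the ordinary Lasso; the adaptive variant uses weights derived from a pilot estimate (e.g. inverse absolute initial coefficients).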
Experiment Setup | Yes | "For all of the 20 experiments, we performed N_iter = 10 (see 4.4.4) iterations of the alternating minimization algorithm (1) using half of the training set. The parameters λ and γ were selected by cross-validation, minimizing the nRMSE over the second half of the training set (the validation set). The accuracy is computed by Acc = (1/p²) Σ_{1≤i,j≤p} 1{T_τ(C_ij) = Y_ij}, where T_τ(x) = 1 if |x| > τ and 0 otherwise, taking τ = 0.05. We train the models with 80 points, still selecting λ and γ by cross-validation."
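Reading the (extraction-damaged) accuracy formula as the fraction of entries where the thresholded coefficient matrix agrees with the ground-truth adjacency Y, it can be written in a few lines of numpy; this reading, and the names `C_hat` and `Y`, are assumptions.

```python
import numpy as np

def edge_accuracy(C_hat, Y, tau=0.05):
    """Fraction of the p^2 entries where the thresholded estimate
    T_tau(C_hat) (1 if |x| > tau, else 0) matches the adjacency Y."""
    pred = (np.abs(C_hat) > tau).astype(int)
    return (pred == Y).mean()
```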