DAG-Informed Structure Learning from Multi-Dimensional Point Processes
Authors: Chunming Zhang, Muhong Gao, Shengji Jia
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, simulation studies indicate that our proposed DAG-constrained estimator, when appropriately penalized, yields more accurate graphs compared to unconstrained or unregularized estimators. Finally, we apply the proposed method to two real MuTPP datasets. Keywords: asymptotic consistency; causal structure; constrained optimization; multivariate counting process; Structural Hamming Distance. |
| Researcher Affiliation | Academia | Chunming Zhang (Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA); Muhong Gao (Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China); Shengji Jia (School of Statistics and Mathematics, Shanghai Lixin University of Accounting and Finance, Shanghai, China) |
| Pseudocode | Yes | Algorithm 1 Flexible Augmented Lagrangian (Flex-AL) algorithm for solving (12) Algorithm 2 Proximal Quasi-Newton (PXQN) algorithm for solving (25) |
| Open Source Code | No | The paper does not contain an explicit statement about releasing its own source code, nor does it provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | We analyze the neuronal spike train dataset in Fujisawa et al. (2008), comprising multineuron recordings obtained from rats performing a working memory task. This dataset, available at http://crcns.org/data-sets/pfc/pfc-2/about-pfc-2... We test our proposed method on the IPTV (Internet Protocol television) viewing record dataset, which is publicly available and accessible at https://ieee-dataport.org. |
| Dataset Splits | Yes | Following the cleaning process, we partition the data into two sets: the training set encompassing spikes within the time range of [100, 1600] seconds, and the testing set containing spikes between [1600, 2600] seconds... This cleaned data is split into the training set (containing timestamps in the first 400 hours) and the testing set (containing timestamps in the following 368 hours). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, or cloud resources) used to run the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | The point process data is generated using model (4) with true covariates: x_i(t) = g(N_i((t − φ, t])/φ), t ∈ [0, T], i = 1, . . . , d, where φ = 1 and g(x) = log{1 + min(x, 10)}. For each node j = 1, . . . , d, the true baseline parameters are set as w*_{0,j} = 0.8. For i, j = 1, . . . , d, the true interaction parameters from node i to node j are w*_{i,j} = 0.5 for an excitatory effect, w*_{i,j} = −0.5 for an inhibitory effect, and w*_{i,j} = 0 for no effect. The total time length T is selected from grid points ranging between 300 and 1600... Both tuning parameters η1 and η2 are chosen by minimizing the Bayesian Information Criterion (BIC) function (Nishii, 1984)... For each simulation replication, both algorithms are employed using the DAG-w L1 and DAG-unreg methods to estimate Network 3. The total time length of the synthetic point process data was fixed at T = 1200. To ensure a fair comparison, identical step sizes {β_α, β_ρ, γ_α, γ_ρ} were utilized in both algorithms (refer to (20), (21), (23), and (24)), specifically set to β_α = β_ρ = γ_α = γ_ρ = 5. Both algorithms adopted the same stopping rule described in Section 4.2, terminating once h(W^(k)) < ϵ_h at a certain iteration step k, with ϵ_h = 10^{-5}. |
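The stopping rule in the experiment setup checks an acyclicity measure h(W) against the tolerance ϵ_h = 10^{-5}. The paper's exact definition of h is not quoted in this table, so the sketch below assumes the widely used NOTEARS-style measure h(W) = tr(exp(W ∘ W)) − d (Zheng et al., 2018), which is zero exactly when the weighted adjacency matrix W encodes a DAG; the function name `h_acyclicity` is illustrative, not from the paper.

```python
import numpy as np
from scipy.linalg import expm

def h_acyclicity(W):
    """Acyclicity measure h(W) = tr(exp(W ∘ W)) - d (NOTEARS-style assumption).

    h(W) == 0 iff the weighted graph with adjacency |W| has no directed cycle;
    the ∘ denotes the elementwise (Hadamard) product.
    """
    d = W.shape[0]
    return np.trace(expm(W * W)) - d

eps_h = 1e-5  # stopping tolerance quoted in the experiment setup

# A DAG: strictly upper-triangular weights (e.g. excitatory 0.5, inhibitory -0.5)
W_dag = np.array([[0.0, 0.5,  0.0],
                  [0.0, 0.0, -0.5],
                  [0.0, 0.0,  0.0]])

# A cyclic graph: node 1 -> node 2 and node 2 -> node 1
W_cyc = np.array([[0.0, 0.5],
                  [0.5, 0.0]])

print(h_acyclicity(W_dag) < eps_h)  # DAG passes the stopping rule
print(h_acyclicity(W_cyc) < eps_h)  # cycle fails it
```

Under this assumed h, an augmented-Lagrangian scheme such as the paper's Flex-AL would tighten the penalty on h(W) each outer iteration and terminate once h(W^(k)) falls below ϵ_h.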