Estimating Network-Mediated Causal Effects via Principal Components Network Regression

Authors: Alex Hayes, Mark M. Fredrickson, Keith Levin

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We develop a method to decompose causal effects on a social network into an indirect effect mediated by the network, and a direct effect independent of the social network. ... We prove asymptotic theory for regression coefficients from this procedure and show that it is widely applicable, allowing for a variety of distributions on the regression errors and network edges. ... In data applications, we find that adolescent girls end up smoking more than adolescent boys primarily due to an indirect network effect. ... 4. Simulations ... 5. Data Applications
Researcher Affiliation | Academia | Alex Hayes, Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA; Mark M. Fredrickson, Department of Statistics, University of Michigan, Ann Arbor, MI, USA; Keith Levin, Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
Pseudocode | No | The paper describes methods and processes in detail, such as the Principal Components Network Regression and estimation steps, but it does not include any explicitly labeled 'Pseudocode', 'Algorithm', or 'Algorithm X' blocks or figures that present structured, code-like procedures.
Open Source Code | Yes | A replication package for our simulations and data analysis is available at https://github.com/alexpghayes/network-mediation-replication.
Open Datasets | Yes | We first revisit the Teenage Friends and Lifestyle Study described in Section 2.1, focusing on the causal effect of sex on smoking during the first wave of the study. ... Michell and West (1996), Michell and Amos (1997), Michell (1997), and Michell (2000). ... We next use our method to re-analyze data from a randomized controlled trial of a smartphone-based well-being training called the Healthy Minds Program, originally reported in Hirshberg et al. (2022). ... To demonstrate this, we re-analyzed the Add Health data set investigated in the initial pre-print of Le and Li (2022).
Dataset Splits | No | The paper mentions collecting data in 'three waves' for the Teenage Friends and Lifestyle Study and focusing on 'the first wave'. For the Healthy Minds data, it states 'we only consider the 533 study participants who responded to all survey questions at the end of the intervention period'. It also states 'The Add Health data consists of a self-reported social network of 2,152 high school students'. These describe subsets or characteristics of the data used but do not provide specific training/test/validation splits or cross-validation details for experimental reproduction.
Hardware Specification | No | The paper does not contain any specific hardware details such as GPU/CPU models, memory amounts, or cloud computing instance types used for experiments or simulations.
Software Dependencies | No | The paper describes the statistical methods used, such as 'ordinary least squares' and 'Huber-White robust standard errors', and mentions concepts like 'singular value decomposition', but does not specify any software names with version numbers, programming languages with their versions, or specific libraries used for implementation.
Experiment Setup | Yes | In our results below, we find that our two-stage regression estimators are able to reliably recover regression coefficients and mediated effects, up to orthogonal nonidentifiability where appropriate. We conduct simulations using two separate models to generate network structure, both based on the degree-corrected stochastic blockmodel. ... For each model, we sample (A, Y, W) for varying number of nodes n and latent dimensions d, and compute point estimates and confidence intervals for $\hat{\Theta}$, $\hat{\beta}$, $\hat{\Psi}_{\mathrm{nde}}$ and $\hat{\Psi}_{\mathrm{nie}}$. ... smoking ~ sex + age + church + Fhat; Fhat ~ sex + age + church.
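
The two-stage regression quoted in the Experiment Setup row can be illustrated with a rough Python sketch. This is not the authors' implementation, which lives in the replication package linked above. Everything below is an assumption made for illustration: the simulated data, the function names (`spectral_embedding`, `ols`, `hc0_se`), the choice of embedding (left singular vectors scaled by the square roots of the singular values), and the product-of-coefficients formulas for the direct and indirect effects. The sketch embeds a simulated network, regresses the embedding on a binary covariate, regresses the outcome on the covariate plus the embedding, and attaches Huber-White (HC0) standard errors to the outcome regression.

```python
import numpy as np

def spectral_embedding(A, d):
    """Rank-d embedding U_d * sqrt(S_d) from the SVD of adjacency matrix A."""
    U, S, _ = np.linalg.svd(A)
    return U[:, :d] * np.sqrt(S[:d])

def ols(X, Y):
    """Least-squares coefficients; Y may be a vector or a matrix."""
    return np.linalg.lstsq(X, Y, rcond=None)[0]

def hc0_se(X, y, coef):
    """Huber-White (HC0) standard errors for a single-outcome OLS fit."""
    resid = y - X @ coef
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * resid[:, None] ** 2)
    return np.sqrt(np.diag(bread @ meat @ bread))

rng = np.random.default_rng(0)
n, d = 300, 2

# Hypothetical data: a binary covariate that shifts latent positions F.
sex = rng.integers(0, 2, size=n).astype(float)
F = np.column_stack([
    0.1 + 0.3 * sex + 0.3 * rng.uniform(size=n),
    0.5 - 0.3 * sex + 0.3 * rng.uniform(size=n),
])
P = F @ F.T  # edge probabilities; at most 0.8 by construction
upper = np.triu(rng.uniform(size=(n, n)) < P, k=1)
A = (upper + upper.T).astype(float)  # symmetric, no self-loops
y = 1.0 + 0.5 * sex + 2.0 * F[:, 0] - 1.0 * F[:, 1] + rng.normal(scale=0.5, size=n)

# Stage 1: estimate latent positions from the network.
Fhat = spectral_embedding(A, d)

# Stage 2a: mediator regression, Fhat ~ sex.
W = np.column_stack([np.ones(n), sex])
theta = ols(W, Fhat)                  # shape (2, d)

# Stage 2b: outcome regression, y ~ sex + Fhat.
X_out = np.column_stack([W, Fhat])
beta = ols(X_out, y)                  # shape (2 + d,)
se = hc0_se(X_out, y, beta)

# Assumed product-of-coefficients decomposition.
nde = beta[1]                         # direct effect of sex
nie = theta[1] @ beta[2:]             # effect routed through Fhat
total = ols(W, y)[1]                  # coefficient from y ~ sex alone

print(f"nde={nde:.3f} (HC0 se {se[1]:.3f}), nie={nie:.3f}, total={total:.3f}")
```

The embedding is only determined up to an orthogonal transformation, so the individual entries of `theta` and `beta[2:]` are not meaningful on their own, while their inner product `nie` is unaffected by such a rotation. In this sketch `nde + nie` matches `total` because of an algebraic property of least squares when both stages share the same covariates; that check says nothing about how well the estimates recover the simulated truth.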