Logistic Regression Under Network Dependence

Authors: Somabha Mukherjee, Ziang Niu, Sagnik Halder, Bhaswar B. Bhattacharya, George Michailidis

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | we develop an efficient algorithm for computing the estimates and validate our theoretical results in numerical experiments. An application to selecting genes in clustering spatial transcriptomics data is also discussed. ... We evaluate the performance of the PMPL estimator using Algorithm 1 on synthetic data.
Researcher Affiliation | Academia | Somabha Mukherjee (EMAIL), Department of Statistics and Data Science, National University of Singapore, Singapore; Ziang Niu (EMAIL), Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, USA; Sagnik Halder (EMAIL), Department of Statistics, University of Florida, Gainesville, USA; Bhaswar B. Bhattacharya (EMAIL), Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, USA; George Michailidis (EMAIL), Department of Statistics and Data Science, University of California, Los Angeles, Los Angeles, USA
Pseudocode | Yes | Algorithm 1: Fix a value of $\tau \in (0, 1)$ and $\delta > 0$. Initialize with $\gamma^{(0)} = 0 \in \mathbb{R}^{d+1}$ and $t^{(0)} = 1$. while $\lVert \gamma^{(s+1)} - \gamma^{(s)} \rVert_1 > \delta$ do: if $L_N(\gamma^{(s)} - t^{(s)} G_{t^{(s)}}(\gamma^{(s)})) \le L_N(\gamma^{(s)}) - t^{(s)} \nabla L_N(\gamma^{(s)})^\top G_{t^{(s)}}(\gamma^{(s)}) + \frac{t^{(s)}}{2} \lVert G_{t^{(s)}}(\gamma^{(s)}) \rVert_2^2$ then $\gamma^{(s+1)} \leftarrow \mathrm{prox}_{t^{(s)}}(\gamma^{(s)} - t^{(s)} \nabla L_N(\gamma^{(s)}))$ and $t^{(s+1)} \leftarrow t^{(s)}$; end if. if $L_N(\gamma^{(s)} - t^{(s)} G_{t^{(s)}}(\gamma^{(s)})) > L_N(\gamma^{(s)}) - t^{(s)} \nabla L_N(\gamma^{(s)})^\top G_{t^{(s)}}(\gamma^{(s)}) + \frac{t^{(s)}}{2} \lVert G_{t^{(s)}}(\gamma^{(s)}) \rVert_2^2$ then $\gamma^{(s+1)} \leftarrow \gamma^{(s)}$ and $t^{(s+1)} \leftarrow \tau t^{(s)}$; end if. end while
Open Source Code | No | The paper does not provide an explicit statement about the open-source availability of their implementation code nor a direct link to a code repository. It mentions the general license for the paper itself and third-party tools like 'scanpy' and the 'Leiden algorithm'.
Open Datasets | Yes | We consider the Visium spatial gene expression data set for human breast cancer (see https://www.10xgenomics.com/spatial-transcriptomics for details about the spatial capture technology) available in the Python package scanpy. The data set is available at https://support.10xgenomics.com/spatial-gene-expression/datasets
Dataset Splits | No | The paper describes generating synthetic data for numerical experiments and applying clustering to spatial transcriptomics data. It does not provide specific training/test/validation dataset splits for model evaluation or generalization, as the main application involves clustering rather than a supervised learning task with explicit data splits.
Hardware Specification | No | The paper describes computational methods and numerical experiments but does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications) used to perform these computations.
Software Dependencies | No | The paper mentions using the 'Python package scanpy' and the 'Leiden algorithm', and refers to 'Python command', but it does not specify version numbers for these or any other software dependencies needed to replicate the experiment.
Experiment Setup | Yes | For the numerical experiments, we consider $\beta = 0.3$ and choose the first $s$ regression coefficients $\theta_1, \theta_2, \ldots, \theta_s$ independently from Uniform([ 1, 1/2, 1]), while the remaining $d - s$ regression coefficients $\theta_{s+1}, \theta_{s+2}, \ldots, \theta_d$ are set to zero. The covariates $Z_1, Z_2, \ldots, Z_N$ are sampled i.i.d. from a $d$-dimensional multivariate Gaussian distribution with mean vector 0 and covariance matrix $\Sigma = ((\sigma_{ij}))_{1 \le i, j \le d}$, with $\sigma_{ij} = 0.2^{\lvert i - j \rvert}$. ... by running the Gibbs sampling algorithm described above for 30000 iterations. We then apply Algorithm 1 by setting $\varepsilon = 0.001$, $\tau = 0.8$, and consider the solution paths of the PMPL estimator as a function of $\log(\lambda)$, when the network $G_N$ is selected to be the Erdős-Rényi (ER) model and the stochastic block model (SBM). We set the range of $\lambda$ to be a geometric sequence of length 100 from 0.001 to 0.1. ... To select the regularization parameter $\lambda$, we use a Bayesian Information Criterion (BIC). ... We repeat the experiment with $K = 1$, $K = 2$, and $K = 3$.
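
The backtracking proximal-gradient loop quoted under Pseudocode, together with the covariance matrix and the λ grid quoted under Experiment Setup, can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper's loss L_N is a pseudo-likelihood that includes the network term, whereas this sketch substitutes an ordinary average logistic loss; the proximal map is assumed to be soft-thresholding for an ℓ1 penalty; responses are drawn independently rather than by Gibbs sampling; and the demo coefficients, sample sizes, and all function names are hypothetical.

```python
import numpy as np

def ar_covariance(d, rho=0.2):
    """Covariance with entries rho^{|i-j|}, as in the quoted setup."""
    idx = np.arange(d)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def sigmoid(m):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * m))

def soft_threshold(v, k):
    # Proximal map of k * ||.||_1 (assumed penalty).
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def logistic_loss(gamma, Z, y):
    # Stand-in smooth loss: mean logistic negative log-likelihood, y in {0, 1}.
    m = Z @ gamma
    return np.mean(np.logaddexp(0.0, m) - y * m)

def logistic_grad(gamma, Z, y):
    return Z.T @ (sigmoid(Z @ gamma) - y) / len(y)

def prox_gradient(Z, y, lam, tau=0.8, delta=1e-3, max_iter=10_000):
    """Backtracking proximal gradient in the spirit of Algorithm 1."""
    gamma = np.zeros(Z.shape[1])
    t = 1.0
    for _ in range(max_iter):
        grad = logistic_grad(gamma, Z, y)
        candidate = soft_threshold(gamma - t * grad, t * lam)
        G = (gamma - candidate) / t  # generalized gradient G_t(gamma)
        lhs = logistic_loss(candidate, Z, y)
        rhs = logistic_loss(gamma, Z, y) - t * (grad @ G) + 0.5 * t * (G @ G)
        if lhs > rhs + 1e-12:
            # Sufficient decrease fails: keep gamma, shrink the step size.
            t *= tau
            continue
        step = np.abs(candidate - gamma).sum()
        gamma = candidate
        # Unlike the literal while-condition, convergence is only
        # checked after an accepted step.
        if step <= delta:
            break
    return gamma

# Demo with hypothetical sizes and coefficients (not the paper's values).
rng = np.random.default_rng(0)
d, N = 5, 400
Z = rng.multivariate_normal(np.zeros(d), ar_covariance(d), size=N)
theta = np.array([1.0, -0.75, 0.0, 0.0, 0.0])
y = (rng.random(N) < sigmoid(Z @ theta)).astype(float)

# Geometric grid of length 100 from 0.001 to 0.1, as in the quoted setup.
lambdas = np.geomspace(0.001, 0.1, num=100)
path = np.array([prox_gradient(Z, y, lam) for lam in lambdas])
```

Each row of `path` holds the estimate for one value of λ, so plotting its columns against `np.log(lambdas)` gives a solution path of the kind the setup describes. Selecting λ by BIC and swapping in the network pseudo-likelihood are left out of this sketch.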