Semi-supervised Inference for Block-wise Missing Data without Imputation

Authors: Shanshan Song, Yuanyuan Lin, Yong Zhou

JMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive simulations are conducted to examine the theoretical results. The method is evaluated on the Alzheimer's Disease Neuroimaging Initiative data.
Researcher Affiliation Academia Shanshan Song (EMAIL), School of Mathematical Sciences and School of Economics and Management, Tongji University, Shanghai 200092, China; Yuanyuan Lin (EMAIL), Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China; Yong Zhou (EMAIL), Key Laboratory of Advanced Theory and Application in Statistics and Data Science (MOE), Academy of Statistics and Interdisciplinary Sciences and School of Statistics, East China Normal University, Shanghai 200062, China
Pseudocode Yes Algorithm 1: Proximal gradient descent with momentum
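The paper's Algorithm 1 is a proximal gradient descent with momentum. The paper's exact update rule is not reproduced in this report; the following is a generic FISTA-style sketch for a lasso-type objective, which is the standard instance of this algorithm family (the function names and the objective (1/2n)||y - Xb||^2 + lam||b||_1 are illustrative assumptions, not the authors' code):

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1 (elementwise soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_grad_momentum(X, y, lam, n_iter=500):
    """FISTA-style proximal gradient descent with momentum for the
    lasso objective (1/2n)||y - X b||^2 + lam ||b||_1.
    A generic sketch, not the paper's exact Algorithm 1."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n   # Lipschitz constant of the gradient
    b = np.zeros(p)
    z = b.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ z - y) / n
        b_new = soft_threshold(z - grad / L, lam / L)
        t_new = (1 + np.sqrt(1 + 4 * t ** 2)) / 2
        z = b_new + (t - 1) / t_new * (b_new - b)   # momentum extrapolation
        b, t = b_new, t_new
    return b
```

The momentum (extrapolation) step accelerates the plain proximal gradient method from O(1/k) to O(1/k^2) convergence on this class of objectives.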
Open Source Code No The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets Yes The method is evaluated on the Alzheimer's Disease Neuroimaging Initiative data. Keywords: block-missing data, confidence intervals, hypothesis testing, semi-supervised inference ...In this subsection, we apply our method to analyze the ADNI data (Mueller et al., 2005).
Dataset Splits Yes In our analysis, the observations in the third phase of the ADNI study (ADNI-3) at year 2 visit are regarded as labelled data, and the observations in ADNI-2 at year 2 visit are treated as unlabelled data. To ensure independence of the labelled and unlabelled data, the subjects in the labelled data set are removed from the unlabelled data set on the basis of the visit code provided by the ADNI study. We normalize the response MMSE before analysis. Overall, 172 features are from MRI and 208 from PET. There are 334 labelled subjects, which include (1) 116 participants with complete MRI and PET features; (2) 102 participants with only MRI features; (3) 116 participants with only PET features. Thus, block missingness occurs when we integrate the data for a combined analysis. Meanwhile, 334 unlabelled participants are available. Thus, K = 3, n = 334, p = 380, N = 333, p1 = 172, p2 = 208, p3 = 380, n1 = 102 and n2 = n3 = 116.
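The labelled ADNI sample described above decomposes into three missingness groups (complete, MRI-only, PET-only) over 172 MRI and 208 PET features. A hypothetical numpy sketch of that block-missing layout, with random placeholders standing in for the real ADNI features:

```python
import numpy as np

# Dimensions from the paper: 172 MRI + 208 PET = 380 features;
# group sizes 116 (complete), 102 (MRI only), 116 (PET only).
p_mri, p_pet = 172, 208
groups = {"complete": 116, "mri_only": 102, "pet_only": 116}

rng = np.random.default_rng(0)
blocks = []
for name, size in groups.items():
    # Placeholder data; the real features come from the ADNI study.
    Xg = rng.standard_normal((size, p_mri + p_pet))
    if name == "mri_only":
        Xg[:, p_mri:] = np.nan   # PET block unobserved
    elif name == "pet_only":
        Xg[:, :p_mri] = np.nan   # MRI block unobserved
    blocks.append(Xg)

X_labelled = np.vstack(blocks)   # 334 labelled subjects, p = 380
```

Stacking the groups makes the "block" in block-missing concrete: entire modality-by-group submatrices are missing, rather than scattered individual entries.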
Hardware Specification No The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments.
Software Dependencies No The paper does not provide specific software dependencies (e.g., library or solver names with version numbers) used for running the experiments.
Experiment Setup Yes For pointwise confidence intervals and hypothesis testing problems, we compare our proposed method with, if applicable, (i) Z. & Z.: the debiasing method proposed by Zhang and Zhang (2014) with scaled lasso using complete observations only; (ii) G. et al.: the debiasing method by van de Geer et al. (2014) with lasso using complete observations only; (iii) X. et al.: a revised version of the imputation-based method studied by Xue et al. (2021). Three error distributions are tried: (1) the standard normal distribution, denoted by N(0, 1); (2) Student's t distribution with 3 degrees of freedom, denoted by t(3); (3) the Weibull distribution with shape parameter 0.5 and scale parameter 0.3, denoted by WB(0.5, 0.3). We consider three simulated examples below:
(E1): The predictor vector X follows the Gaussian distribution N(0, Σ) with Σ_{i,j} = 0.4^{|i-j|}, and the three distributions of ϵ are tried. We set the target parameter β = (0.8_3, 0_{p/2-3}, 0.8_3, 0_{p/2-3})^T, where p is the dimensionality of X and 0.8_3 denotes the value 0.8 repeated three times. The number of covariates in the active set is s0 = 6. We consider the block-missing structure with 2 modalities, as in Figure 1(a). The labelled samples are uniformly assigned to the two groups and the unlabelled data are independently generated from N(0, Σ). Here, K = 2, n = 200 or 300, p = 450, N = 1000 or 5000, p1 = p2 = 225 and n1 = n2 = n/2.
(E2): The predictor vector X follows (i) the Gaussian distribution N(0, Σ) with covariance matrix satisfying Σ_{i,j} = 0.4^{|i-j|} or (ii) the Gaussian mixture distribution 0.5 N(0, Σ^(1)) + 0.5 N(0, Σ^(2)) with Σ^(1)_{i,j} = 0.4^{|i-j|} and Σ^(2)_{i,j} = 0.42^{|i-j|}, and the above three settings of ϵ are considered. We set the target parameter β = (0.8_3, 0_{p/3-3}, 0.8_3, 0_{p/3-3}, 0.8_3, 0_{p/3-3})^T and s0 = 9. The block-missing structure with 3 modalities is presented in Figure 1(b). The labelled samples are uniformly assigned to the three groups and the unlabelled data are independently generated from (i) or (ii).
Here, K = 3, n = 300 or 600, p = 150, N = 5000, p1 = p2 = p3 = 50 and n1 = n2 = n3 = n/3.
(E3): The predictor vector X follows the Gaussian distribution N(0, Σ) with Σ_{i,j} = 0.4^{|i-j|} and ϵ ~ N(0, 1). Set the target parameter β = (0.8_3, 0_{p/3-3}, 0.8_3, 0_{p/3-3}, 0.8_3, 0_{p/3-3})^T and s0 = 9. We consider the block-missing structure with 3 modalities, as in Figure 1(c). Samples with incomplete observations are randomly assigned to the first three groups with probabilities (0.4, 0.3, 0.3); samples with complete observations are generated independently; the unlabelled data are independently generated from N(0, Σ). Here, K = 4, n = 360, 400 or 500, p = 150, N = 5000, p1 = 30, p2 = 90, p3 = 60, p4 = 150 and n4 = 60, 100 or 200. We set α = 0.05. For each component of θ^(k), a bias-correction idea is to construct a projection direction by minimizing the variance with the bias constrained (Zhang and Zhang, 2014; Javanmard and Montanari, 2014), but it could be computationally expensive to identify such a projection direction for each component of θ^(k). We use cross-validation to select λ_k among the K groups in the numerical studies. ...the tuning parameter for the method X. et al. was selected by 10-fold cross-validation in each replication.
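The simulation designs above share the same ingredients: an AR(1)-type covariance Σ_{i,j} = 0.4^{|i-j|}, a sparse β with blocks of 0.8s, and a group-wise block-missing mask. A sketch of the data generation for example (E1) under the stated design (the function names and the assignment of modality 1 to group 1 and modality 2 to group 2 are illustrative assumptions; the paper's Figure 1(a) fixes the actual pattern):

```python
import numpy as np

def ar1_cov(p, rho=0.4):
    # Covariance matrix with entries rho^{|i-j|}.
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def simulate_e1(n=200, p=450, N=1000, seed=0):
    """Generate data for example (E1): X ~ N(0, Sigma) with
    Sigma_{i,j} = 0.4^{|i-j|}, beta with 0.8 in positions 1-3 and
    (p/2+1)-(p/2+3) (so s0 = 6), standard normal errors, and a
    2-modality block-missing pattern on the labelled sample.
    A sketch of the stated design, not the authors' code."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(ar1_cov(p))
    beta = np.zeros(p)
    beta[:3] = 0.8
    beta[p // 2: p // 2 + 3] = 0.8
    X = rng.standard_normal((n, p)) @ chol.T        # labelled covariates
    y = X @ beta + rng.standard_normal(n)           # responses (N(0,1) errors)
    # Labelled samples split uniformly into two groups of size n/2;
    # each group observes only one modality (half of the features).
    X_obs = X.copy()
    X_obs[: n // 2, p // 2:] = np.nan   # group 1: modality 1 observed only
    X_obs[n // 2:, : p // 2] = np.nan   # group 2: modality 2 observed only
    X_unlab = rng.standard_normal((N, p)) @ chol.T  # unlabelled, fully observed
    return X_obs, y, X_unlab, beta
```

Examples (E2) and (E3) follow the same template with three (or four) groups, β blocks of length 3 at positions 1, p/3+1 and 2p/3+1, and, for (E3), random group assignment with probabilities (0.4, 0.3, 0.3).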