A Kernel-based Test of Independence for Cluster-correlated Data
Authors: Hongjiao Liu, Anna Plantinga, Yunhua Xiang, Michael Wu
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Based on both simulation studies and real data analysis, we show that, with clustered data, our approach effectively controls type I error and has a higher statistical power than competing methods. |
| Researcher Affiliation | Academia | Hongjiao Liu, Department of Biostatistics, University of Washington; Anna M. Plantinga, Department of Mathematics and Statistics, Williams College; Yunhua Xiang, Department of Biostatistics, University of Washington; Michael C. Wu, Public Health Sciences Division, Fred Hutchinson Cancer Research Center |
| Pseudocode | No | The paper does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | All of our codes are implemented in R, and are available at https://github.com/Liujiao92/HSICcl. |
| Open Datasets | Yes | Here we apply HSICcl and competing methods to test the dependence between the overall vaginal microbiome composition and different metabolic pathways, using data from the Menopause Strategies: Finding Lasting Answers for Symptoms and Health (MsFLASH) Vaginal Health Trial [27]. |
| Dataset Splits | No | The paper does not specify explicit training, validation, or test dataset splits. It mentions using m clusters and d time points for simulations and real data analysis, but no partitioning for model training/validation/testing. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states 'All of our codes are implemented in R' but does not specify the version of R or any specific R libraries with their version numbers. |
| Experiment Setup | Yes | For both X and Y, we consider two different kernels: the Gaussian kernel $k_X(z_1, z_2) = k_Y(z_1, z_2) = \exp(-\lVert z_1 - z_2 \rVert_2^2 / \tau)$ and the linear kernel $k_X(z_1, z_2) = k_Y(z_1, z_2) = z_1^T z_2$. For the Gaussian kernel, the shape parameter $\tau$ is chosen as the median of the Euclidean distance between each sample pair. |
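
The kernel choices quoted in the Experiment Setup row can be illustrated with a minimal sketch. The authors' code is in R; this is a plain-Python illustration, not their implementation. All function names and the toy data are hypothetical, and the sketch assumes that the median is taken over distinct sample pairs and over unsquared Euclidean distances, which the quoted text does not spell out.

```python
import math
from statistics import median


def squared_distance(z1, z2):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(z1, z2))


def median_pairwise_distance(samples):
    """Median Euclidean distance over all distinct sample pairs (assumption)."""
    n = len(samples)
    distances = [
        math.sqrt(squared_distance(samples[i], samples[j]))
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return median(distances)


def gaussian_kernel_matrix(samples, tau=None):
    """K[i][j] = exp(-||z_i - z_j||^2 / tau); tau defaults to the median distance."""
    if tau is None:
        tau = median_pairwise_distance(samples)
    return [
        [math.exp(-squared_distance(a, b) / tau) for b in samples]
        for a in samples
    ]


def linear_kernel_matrix(samples):
    """K[i][j] = z_i^T z_j."""
    return [
        [sum(x * y for x, y in zip(a, b)) for b in samples]
        for a in samples
    ]


# Toy data, made up for illustration (not from the paper).
toy = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
tau = median_pairwise_distance(toy)   # pairwise distances are 5, 10, 5
K_gauss = gaussian_kernel_matrix(toy)
K_lin = linear_kernel_matrix(toy)
```

Both functions return symmetric matrices, and the Gaussian kernel matrix has ones on its diagonal. Whether the median should include self-pairs or use squared distances is a convention that varies between implementations, so the released R code should be checked before relying on this choice.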