Statistical Inference for Noisy Incomplete Binary Matrix

Authors: Yunxiao Chen, Chengcheng Li, Jing Ouyang, Gongjun Xu

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "A simulation study is given in Section 4, and two real-data applications are presented in Section 5. ... We study the finite-sample performance of the likelihood-based estimator. ... For each setting, 2000 independent data sets are generated from the considered model."
Researcher Affiliation | Academia | Yunxiao Chen (Department of Statistics, London School of Economics and Political Science, London WC2A 2AE, UK); Chengcheng Li, Jing Ouyang, Gongjun Xu (Department of Statistics, University of Michigan, Ann Arbor, MI 48109, USA)
Pseudocode | Yes | "Algorithm 1: Projected Gradient Descent Algorithm. Input: partially observed data matrix Y, learning rates γ1 and γ2, tolerance ϵ, and initial values θ^(1) = (θ_1^(1), ..., θ_N^(1))^T and β^(1) = (β_1^(1), ..., β_J^(1))^T. Initialize l^(0) = −∞ and l^(1) = l(θ^(1), β^(1)), and iteration number t = 1; while |l^(t) − l^(t−1)| > ϵ do ..."
Open Source Code | Yes | "The R code for our numerical experiments can be found in https://github.com/Austinlccvic/A-Note-on-Statistical-Inference-for-Noisy-Incomplete-1-Bit-Matrix."
Open Datasets | Yes | "The data set is a benchmark data set for studying linking methods for educational testing (González and Wiberg, 2017). It contains binary responses from two forms of a college admission test."
Dataset Splits | No | The paper describes the composition of the datasets (e.g., N = 4000, J = 200, 40% missing for educational testing; N = 139, J = 1648, 26.1% missing for Senate voting) and how simulated data are generated, but it does not specify explicit training/validation/test splits needed to reproduce the experimental results.
Hardware Specification | No | The paper mentions "average CPU time per iteration" in the simulation study (Section 4), but does not provide any specific details about the CPU model, GPU, memory, or other hardware used for running the experiments.
Software Dependencies | No | The paper states, "The R code for our numerical experiments can be found in https://github.com/Austinlccvic/A-Note-on-Statistical-Inference-for-Noisy-Incomplete-1-Bit-Matrix." While it specifies the programming language (R) and links to the code, it does not give version numbers for R or for any R packages used, which are necessary for reproducibility.
Experiment Setup | Yes | "Algorithm 1: Projected Gradient Descent Algorithm. Input: partially observed data matrix Y, learning rates γ1 and γ2, tolerance ϵ, and initial values θ^(1) = (θ_1^(1), ..., θ_N^(1))^T and β^(1) = (β_1^(1), ..., β_J^(1))^T. ... The convergence criterion is that the consecutive change in the joint log-likelihood is smaller than 0.001."
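The quoted Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' R implementation: it assumes the paper's low-rank logistic model P(Y_ij = 1) = σ(θ_i + β_j) fit by likelihood ascent over the observed entries, with the stopping rule quoted above (consecutive log-likelihood change below ϵ). The projection onto a box [−C, C], the default step sizes, and the simultaneous (rather than alternating) update of θ and β are assumptions made for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_lik(Y, mask, theta, beta):
    # Joint log-likelihood over observed entries only (mask = 1 where observed).
    M = theta[:, None] + beta[None, :]
    return np.sum(mask * (Y * M - np.log1p(np.exp(M))))

def pgd(Y, mask, gamma1=0.01, gamma2=0.01, eps=1e-3, C=5.0, max_iter=10_000):
    """Projected gradient ascent for P(Y_ij = 1) = sigmoid(theta_i + beta_j).

    The box bound C and the simultaneous updates are assumptions of this sketch.
    Stops when the consecutive change in the joint log-likelihood is <= eps.
    """
    N, J = Y.shape
    theta, beta = np.zeros(N), np.zeros(J)
    l_prev = -np.inf  # so the loop runs at least once
    for _ in range(max_iter):
        M = theta[:, None] + beta[None, :]
        R = mask * (Y - sigmoid(M))  # score (residual) on observed entries
        # Gradient steps, then projection onto the box [-C, C].
        theta = np.clip(theta + gamma1 * R.sum(axis=1), -C, C)
        beta = np.clip(beta + gamma2 * R.sum(axis=0), -C, C)
        l_curr = log_lik(Y, mask, theta, beta)
        if abs(l_curr - l_prev) <= eps:
            break
        l_prev = l_curr
    return theta, beta
```

Each iteration costs O(NJ) for the dense residual matrix, which matches the "average CPU time per iteration" framing in the paper's simulation study; the tolerance eps mirrors the 0.001 convergence threshold quoted above.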