Statistical Inference for Noisy Incomplete Binary Matrix
Authors: Yunxiao Chen, Chengcheng Li, Jing Ouyang, Gongjun Xu
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | A simulation study is given in Section 4, and two real-data applications are presented in Section 5. ... We study the finite-sample performance of the likelihood-based estimator. ... For each setting, 2000 independent data sets are generated from the considered model. |
| Researcher Affiliation | Academia | Yunxiao Chen EMAIL Department of Statistics London School of Economics and Political Science London WC2A 2AE, UK Chengcheng Li EMAIL Jing Ouyang EMAIL Gongjun Xu EMAIL Department of Statistics University of Michigan Ann Arbor, MI 48109, USA |
| Pseudocode | Yes | Algorithm 1: Projected Gradient Descent Algorithm. Input: partially observed data matrix Y, learning rates γ1 and γ2, tolerance ϵ, and initial values θ^(1) = (θ_1^(1), ..., θ_N^(1))^T and β^(1) = (β_1^(1), ..., β_J^(1))^T. Initialize l^(0) = −∞ and l^(1) = l(θ^(1), β^(1)), and iteration number t = 1; while (\|l^(t) − l^(t−1)\| > ϵ) do ... |
| Open Source Code | Yes | The R code for our numerical experiments can be found in https://github.com/Austinlccvic/A-Note-on-Statistical-Inference-for-Noisy-Incomplete-1-Bit-Matrix. |
| Open Datasets | Yes | The data set is a benchmark data set for studying linking methods for educational testing (González and Wiberg, 2017). It contains binary responses from two forms of a college admission test. |
| Dataset Splits | No | The paper describes the composition of the datasets (e.g., N=4000, J=200, 40% missing for educational testing; N=139, J=1648, 26.1% missing for Senate voting), and how simulated data is generated, but it does not specify explicit training/validation/test splits for the datasets to reproduce experimental results. |
| Hardware Specification | No | The paper mentions "average CPU time per iteration" in the simulation study (Section 4), but does not provide any specific details about the CPU model, GPU, memory, or other hardware used for running the experiments. |
| Software Dependencies | No | The paper states, "The R code for our numerical experiments can be found in https://github.com/Austinlccvic/A-Note-on-Statistical-Inference-for-Noisy-Incomplete-1-Bit-Matrix." While it specifies the programming language (R) and provides a link to the code, it does not mention specific version numbers for R or any R packages used, which are necessary for reproducibility. |
| Experiment Setup | Yes | Algorithm 1: Projected Gradient Descent Algorithm. Input: partially observed data matrix Y, learning rates γ1 and γ2, tolerance ϵ, and initial values θ^(1) = (θ_1^(1), ..., θ_N^(1))^T and β^(1) = (β_1^(1), ..., β_J^(1))^T. ... The convergence criterion is that the consecutive change in the joint log-likelihood is smaller than 0.001. |
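The quoted Algorithm 1 can be sketched as a short projected gradient ascent on the joint log-likelihood of a two-way logistic model for a partially observed binary matrix. This is a minimal illustration in Python rather than the authors' R implementation: the logistic link `θ_i + β_j`, the box projection via `np.clip` with bound `C`, the learning rates, and the default tolerance are assumptions for the sketch, not details taken from the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_lik(Y, W, theta, beta):
    """Joint log-likelihood over observed entries.

    Y: N x J binary matrix; W: N x J 0/1 observation mask.
    Assumes P(Y_ij = 1) = sigmoid(theta_i + beta_j).
    """
    P = sigmoid(theta[:, None] + beta[None, :])
    eps = 1e-12  # guard against log(0)
    return np.sum(W * (Y * np.log(P + eps) + (1 - Y) * np.log(1 - P + eps)))

def pgd(Y, W, gamma1=0.01, gamma2=0.01, tol=1e-3, C=5.0, max_iter=5000):
    """Projected gradient ascent in the spirit of Algorithm 1.

    Iterates until the consecutive change in the joint log-likelihood
    falls below tol; parameters are projected onto the box [-C, C]
    (the projection set here is an illustrative assumption).
    """
    N, J = Y.shape
    theta, beta = np.zeros(N), np.zeros(J)
    ll_prev, ll = -np.inf, log_lik(Y, W, theta, beta)
    t = 0
    while abs(ll - ll_prev) > tol and t < max_iter:
        # Gradient of the log-likelihood: residuals on observed entries only.
        resid = W * (Y - sigmoid(theta[:, None] + beta[None, :]))
        theta = np.clip(theta + gamma1 * resid.sum(axis=1), -C, C)
        beta = np.clip(beta + gamma2 * resid.sum(axis=0), -C, C)
        ll_prev, ll = ll, log_lik(Y, W, theta, beta)
        t += 1
    return theta, beta
```

The stopping rule mirrors the paper's stated criterion (consecutive log-likelihood change below a tolerance such as 0.001); everything else, including step sizes and the bound `C`, would need to be taken from the authors' repository for a faithful reproduction.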