Optimal Estimation and Completion of Matrices with Biclustering Structures
Authors: Chao Gao, Yu Lu, Zongming Ma, Harrison H. Zhou
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Implementation and simulation results are given in Section 5. Now we present some numerical results to demonstrate the accuracy of the error rate behavior suggested by Theorem 1 on simulated data. |
| Researcher Affiliation | Academia | Chao Gao and Yu Lu, Yale University; Zongming Ma, University of Pennsylvania; Harrison H. Zhou, Yale University |
| Pseudocode | Yes | Algorithm 1: A Biclustering Algorithm |
| Open Source Code | No | The paper does not contain any explicit statements about providing open-source code or links to a code repository for the described methodology. |
| Open Datasets | No | Now we present some numerical results to demonstrate the accuracy of the error rate behavior suggested by Theorem 1 on simulated data. We first generate our data from SBM with the number of blocks $k \in \{2, 4, 8, 16\}$. The observation rate p = 0.5. |
| Dataset Splits | No | The paper describes generating simulated data for numerical studies but does not specify any training/test/validation dataset splits. The evaluation involves comparing the estimator's error rate behavior on these generated datasets. |
| Hardware Specification | No | The paper provides numerical results on simulated data but does not mention any specific hardware used for running these simulations or experiments. |
| Software Dependencies | No | The paper mentions algorithms like 'k-means algorithm' and 'singular value decomposition' but does not specify any software names with version numbers used for implementation or analysis. |
| Experiment Setup | Yes | Our theoretical result indicates the rate of recovery is $\sqrt{\frac{\rho}{p}\left(\frac{k^2}{n^2} + \frac{\log k}{n}\right)}$ for the root mean squared error (RMSE) $\frac{1}{n}\lVert\hat{\theta} - \theta\rVert$. When k is not too large, the dominating term is $\sqrt{\frac{\rho \log k}{pn}}$. We are going to confirm this rate by simulation. We first generate our data from SBM with the number of blocks $k \in \{2, 4, 8, 16\}$. The observation rate p = 0.5. For every fixed k, we use four different $Q = 0.5\,\mathbf{1}_k\mathbf{1}_k^T + 0.1t\,I_k$ with t = 1, 2, 3, 4 and generate the community labels z uniformly on [k]. Then we calculate the error $\frac{1}{n}\lVert\hat{\theta} - \theta\rVert$. Panel (a) of Figure 1 shows the error versus the sample size n. ... We simulate data with Gaussian noise under four different settings of k1 and k2. For each $(k_1, k_2) \in \{(4, 4), (4, 8), (8, 8), (8, 12)\}$, the entries of matrix Q are independently and uniformly generated from {1, 2, 3, 4, 5}. The cluster labels z1 and z2 are uniform on [k1] and [k2] respectively. After generating Q, z1 and z2, we add an N(0, 1) noise to the data and observe Xij with probability p = 0.1. |
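
Since the paper releases no code, the two simulation settings quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`simulate_sbm`, `simulate_bicluster`, `rmse`), the use of NumPy, the symmetric no-self-loop adjacency matrix, the symmetric observation mask, and the `1/sqrt(n1*n2)` normalization for rectangular matrices are all assumptions made here. The estimator itself (Algorithm 1) is not reproduced.

```python
import numpy as np


def simulate_sbm(n, k, t, p=0.5, rng=None):
    """Draw one partially observed SBM network (hypothetical helper).

    Q = 0.5 * 1_k 1_k^T + 0.1 * t * I_k, labels z uniform on [k].
    Assumption: the network is undirected with no self-loops, and the
    observation mask is symmetric as well.
    """
    rng = np.random.default_rng(rng)
    Q = 0.5 * np.ones((k, k)) + 0.1 * t * np.eye(k)
    z = rng.integers(0, k, size=n)           # community labels in {0, ..., k-1}
    theta = Q[np.ix_(z, z)]                  # theta[i, j] = Q[z[i], z[j]]
    upper = np.triu(rng.random((n, n)) < theta, 1)
    A = (upper | upper.T).astype(float)      # symmetric 0/1 adjacency matrix
    obs_upper = np.triu(rng.random((n, n)) < p, 1)
    mask = obs_upper | obs_upper.T           # True where an entry is observed
    return A, mask, theta, z


def simulate_bicluster(n1, n2, k1, k2, p=0.1, rng=None):
    """Draw one partially observed Gaussian biclustering matrix (hypothetical helper).

    Entries of Q are i.i.d. uniform on {1, ..., 5}; z1 and z2 are uniform
    on [k1] and [k2]; noise is N(0, 1); each entry is observed w.p. p.
    """
    rng = np.random.default_rng(rng)
    Q = rng.integers(1, 6, size=(k1, k2)).astype(float)
    z1 = rng.integers(0, k1, size=n1)
    z2 = rng.integers(0, k2, size=n2)
    theta = Q[np.ix_(z1, z2)]                # theta[i, j] = Q[z1[i], z2[j]]
    X = theta + rng.standard_normal((n1, n2))
    mask = rng.random((n1, n2)) < p
    return X, mask, theta, z1, z2


def rmse(theta_hat, theta):
    """Frobenius error divided by sqrt(#entries); equals (1/n)*||.|| when square."""
    return np.linalg.norm(theta_hat - theta) / np.sqrt(theta.size)
```

A sweep over the quoted grid would call `simulate_sbm(n, k, t)` for each `k` in `{2, 4, 8, 16}` and `t` in `{1, 2, 3, 4}`, pass the observed entries to whatever estimator is being studied, and record `rmse(theta_hat, theta)` against `n`.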