Matrix Completion with the Trace Norm: Learning, Bounding, and Transducing
Authors: Ohad Shamir, Shai Shalev-Shwartz
JMLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted experiments on two standard matrix completion data sets, movielens100K and movielens1M. movielens100K contains 10^5 ratings of 943 users for 1770 movies, while movielens1M contains 10^6 ratings of 6040 users for 3706 movies. All ratings are in the range [1, 5]. For each data set, we performed a random 80%-20% split of the data to obtain a training set and a test set. We considered two hypothesis classes: trace-norm constrained matrices {W : ‖W‖_tr ≤ t}, and bounded trace-norm constrained matrices {φ ∘ W : ‖W‖_tr ≤ t}, where φ is a sigmoid function interpolating between 1 and 5. For each hypothesis class, we trained a trace-norm regularized algorithm using the squared loss. The experiments were repeated 5 times over random train-test splits of the data, and the results are summarized in Table 1. |
| Researcher Affiliation | Academia | Ohad Shamir EMAIL Department of Computer Science and Applied Mathematics Weizmann Institute of Science Rehovot 7610001, Israel Shai Shalev-Shwartz EMAIL School of Computer Science and Engineering The Hebrew University Givat Ram, Jerusalem 9190401, Israel |
| Pseudocode | No | The paper describes mathematical concepts, theorems, and experimental procedures but does not include any explicitly labeled pseudocode or algorithm blocks. The optimization problems are presented as mathematical equations. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the described methodology, nor does it include links to a code repository. It mentions using a "common approach of stochastic gradient descent" but does not provide its own implementation. |
| Open Datasets | Yes | We conducted experiments on two standard matrix completion data sets, movielens100K and movielens1M. movielens100K contains 10^5 ratings of 943 users for 1770 movies, while movielens1M contains 10^6 ratings of 6040 users for 3706 movies. All ratings are in the range [1, 5]. These data sets are taken from www.grouplens.org/node/73 |
| Dataset Splits | Yes | For each data set, we performed a random 80%-20% split of the data to obtain a training set and a test set. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments. It only mentions that they "were unable to perform experiments on very large-scale data sets, such as Netflix" due to computational reasons, which implies resource constraints but no specific hardware details. |
| Software Dependencies | No | The paper describes the algorithms used (e.g., "stochastic gradient descent on a factorized representation") and objective functions but does not mention any specific software libraries, frameworks, or their version numbers that were used for implementation. |
| Experiment Setup | No | The paper states that "Tuning of λ was performed with a validation set" for regularization, but it does not provide the specific values for λ or other hyperparameters such as learning rate, batch size, or number of epochs for the stochastic gradient descent algorithm used. |
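
The setup quoted in the table (a random 80%-20% split, a sigmoid φ mapping predictions into [1, 5], and stochastic gradient descent on a factorized representation with the squared loss) might be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names, the parameters `lam` and `lr`, and the per-sample form of the regularization step are all assumptions, since the paper reports no hyperparameter values or code. The Frobenius penalty on the factors is used here as the standard surrogate for the trace norm of W = U Vᵀ.

```python
import numpy as np

def split_80_20(n_ratings, rng):
    """Return (train_idx, test_idx) for a random 80%-20% split of rating indices."""
    perm = rng.permutation(n_ratings)
    n_train = (4 * n_ratings) // 5
    return perm[:n_train], perm[n_train:]

def bounded_predict(U, V, rows, cols):
    """Apply a sigmoid to the factorized entries (U V^T)[i, j], rescaled to [1, 5]."""
    z = np.sum(U[rows] * V[cols], axis=1)
    return 1.0 + 4.0 / (1.0 + np.exp(-z))

def sgd_epoch(U, V, rows, cols, ratings, lam, lr, rng):
    """One SGD pass over the observed entries, updating U and V in place.

    Each step uses the squared loss on one rating plus a Frobenius penalty
    on the two touched factor rows (a simplification; hypothetical choice).
    """
    for k in rng.permutation(len(ratings)):
        i, j = rows[k], cols[k]
        u, v = U[i].copy(), V[j].copy()
        s = 1.0 / (1.0 + np.exp(-(u @ v)))
        pred = 1.0 + 4.0 * s
        # Chain rule through the rescaled sigmoid: d(pred)/dz = 4 * s * (1 - s)
        g = 2.0 * (pred - ratings[k]) * 4.0 * s * (1.0 - s)
        U[i] -= lr * (g * v + lam * u)
        V[j] -= lr * (g * u + lam * v)
```

The unbounded hypothesis class corresponds to dropping the sigmoid and predicting the raw inner product. Any value of `lam` would still need to be tuned on a validation set, as the paper describes.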