reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Matrix Completion with the Trace Norm: Learning, Bounding, and Transducing

Authors: Ohad Shamir, Shai Shalev-Shwartz

JMLR 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conducted experiments on two standard matrix completion data sets, movielens100K and movielens1M. movielens100K contains 10^5 ratings of 943 users for 1770 movies, while movielens1M contains 10^6 ratings of 6040 users for 3706 movies. All ratings are in the range [1, 5]. For each data set, we performed a random 80%-20% split of the data to obtain a training set and a test set. We considered two hypothesis classes: trace-norm constrained matrices {W : ‖W‖_tr ≤ t}, and bounded trace-norm constrained matrices {φ ∘ W : ‖W‖_tr ≤ t}, where φ is a sigmoid function interpolating between 1 and 5. For each hypothesis class, we trained a trace-norm regularized algorithm using the squared loss. The experiments were repeated 5 times over random train-test splits of the data, and the results are summarized in Table 1.
Researcher Affiliation	Academia	Ohad Shamir EMAIL Department of Computer Science and Applied Mathematics Weizmann Institute of Science Rehovot 7610001, Israel Shai Shalev-Shwartz EMAIL School of Computer Science and Engineering The Hebrew University Givat Ram, Jerusalem 9190401, Israel
Pseudocode	No	The paper describes mathematical concepts, theorems, and experimental procedures but does not include any explicitly labeled pseudocode or algorithm blocks. The optimization problems are presented as mathematical equations.
Open Source Code	No	The paper does not provide any explicit statements about releasing source code for the described methodology, nor does it include links to a code repository. It mentions using a "common approach of stochastic gradient descent" but does not provide its own implementation.
Open Datasets	Yes	We conducted experiments on two standard matrix completion data sets, movielens100K and movielens1M. movielens100K contains 10^5 ratings of 943 users for 1770 movies, while movielens1M contains 10^6 ratings of 6040 users for 3706 movies. All ratings are in the range [1, 5]. These data sets are taken from www.grouplens.org/node/73
Dataset Splits	Yes	For each data set, we performed a random 80%-20% split of the data to obtain a training set and a test set.
Hardware Specification	No	The paper does not specify any particular hardware used for running the experiments. It only mentions that they "were unable to perform experiments on very large-scale data sets, such as Netflix" due to computational reasons, which implies resource constraints but no specific hardware details.
Software Dependencies	No	The paper describes the algorithms used (e.g., "stochastic gradient descent on a factorized representation") and objective functions but does not mention any specific software libraries, frameworks, or their version numbers that were used for implementation.
Experiment Setup	No	The paper states that "Tuning of λ was performed with a validation set" for regularization, but it does not provide the specific values for λ or other hyperparameters such as learning rate, batch size, or number of epochs for the stochastic gradient descent algorithm used.
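
The split protocol and the bounded hypothesis class quoted in the table can be illustrated with a minimal sketch. This is not the authors' code (the table records that none was released). The function names, the toy data, the seeds, and the choice of a logistic sigmoid for φ are all assumptions made for illustration; the paper only says that φ is a sigmoid interpolating between 1 and 5.

```python
import math
import random


def split_80_20(ratings, seed):
    """Shuffle observed (user, item, rating) triples and split them 80%/20%.

    Hypothetical helper: the seed and the shuffling scheme are assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    cut = (4 * len(shuffled)) // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def bounded_prediction(w, low=1.0, high=5.0):
    """Map an unbounded matrix entry w into [low, high].

    Assumes a logistic sigmoid; computed in a numerically stable form so
    that large negative inputs do not overflow math.exp.
    """
    if w >= 0:
        s = 1.0 / (1.0 + math.exp(-w))
    else:
        e = math.exp(w)
        s = e / (1.0 + e)
    return low + (high - low) * s


# Toy stand-in for a ratings data set: 100 synthetic triples in [1, 5].
toy_ratings = [(u, i, 1 + (u + i) % 5) for u in range(10) for i in range(10)]

# Five repetitions over different random train-test splits.
splits = [split_80_20(toy_ratings, seed) for seed in range(5)]
```

Each element of `splits` is a `(train, test)` pair. Hyperparameters such as λ, the learning rate, and the number of epochs are not reported in the paper, so the sketch leaves the training step out rather than guessing at them.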