Probabilistic Matrix Factorization for Automated Machine Learning

Authors: Nicolo Fusi, Rishit Sheth, Melih Elibol

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In our experiments, we show that our approach quickly identifies high-performing pipelines across a wide range of datasets, significantly outperforming the current state-of-the-art."
Researcher Affiliation | Collaboration | Nicolo Fusi, Rishit Sheth (Microsoft Research, New England); Melih Elibol (EECS, University of California, Berkeley)
Pseudocode | No | The paper describes methods and equations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "Data and software available at https://github.com/rsheth80/pmf-automl/"
Open Datasets | Yes | "We ran all of the experiments on 553 OpenML [28] datasets"
Dataset Splits | Yes | "We generated training data for our method by splitting each OpenML dataset in 80% training data, 10% validation data and 10% test data"
Hardware Specification | No | The paper mentions "approximately 3 hours on a 16-core Azure machine" but does not specify exact CPU models, GPU models, or memory details.
Software Dependencies | No | The paper mentions software such as "scikit-learn [17]" and the "auto-sklearn library [4]" but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | "We set the number of latent dimensions to Q = 20, stochastic gradient descent learning rate to η = 1e-7, and (column) batch-size to 50. The latent space was initialized using PCA, and training was run for 300 epochs (corresponding to approximately 3 hours on a 16-core Azure machine). Finally, we configured the acquisition function with ξ = 0.012."
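To make the reported setup concrete, the following is a minimal sketch of matrix-factorization training with the paper's stated hyperparameters (Q = 20 latent dimensions, η = 1e-7, 300 epochs, ξ = 0.012) followed by an expected-improvement acquisition step. The matrix sizes, the synthetic data, the random (rather than PCA) initialization, and the constant predictive standard deviation are all assumptions for illustration; the paper's actual model is a non-linear probabilistic matrix factorization, not the plain squared-error factorization shown here.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: D datasets x P pipelines performance matrix.
D, P, Q = 30, 40, 20   # Q = 20 latent dimensions (from the paper)
LR = 1e-7              # SGD learning rate eta = 1e-7 (from the paper)
XI = 0.012             # acquisition-function parameter xi (from the paper)
EPOCHS = 300           # training epochs (from the paper)

# Synthetic, partially observed dataset-by-pipeline performance matrix
# (stand-in for the OpenML pipeline-evaluation data).
true_U = rng.normal(size=(D, Q))
true_V = rng.normal(size=(P, Q))
Y = true_U @ true_V.T
observed = rng.random((D, P)) < 0.6

# Latent factors; the paper initializes the latent space with PCA,
# random initialization is used here for brevity.
U = 0.1 * rng.normal(size=(D, Q))
V = 0.1 * rng.normal(size=(P, Q))

for _ in range(EPOCHS):
    R = observed * (U @ V.T - Y)   # residuals on observed entries only
    U -= LR * (R @ V)              # gradient steps on the squared-error loss
    V -= LR * (R.T @ U)

# Expected-improvement acquisition over the unevaluated pipelines of dataset 0.
def norm_cdf(z):
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def norm_pdf(z):
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)

d = 0
mu = U[d] @ V.T                      # predicted performance per pipeline
best = Y[d][observed[d]].max()       # best performance observed so far
sigma = np.full(P, mu.std() + 1e-9)  # crude stand-in for a predictive std-dev
z = (mu - best - XI) / sigma
ei = (mu - best - XI) * norm_cdf(z) + sigma * norm_pdf(z)
ei[observed[d]] = -np.inf            # only propose unevaluated pipelines
candidate = int(np.argmax(ei))
print("next pipeline to evaluate:", candidate)
```

The ξ term in the expected-improvement expression trades off exploration against exploitation: larger values demand a bigger predicted improvement over the incumbent before a pipeline is selected.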