metric-learn: Metric Learning Algorithms in Python

Authors: William de Vazelhes, CJ Carey, Yuan Tang, Nathalie Vauquier, Aurélien Bellet

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To illustrate, the following code snippet trains a Pipeline composed of LMNN followed by a k-nearest neighbors classifier on the UCI Wine data set, with the hyperparameters selected with a grid-search. Any other supervised metric learner can be used in place of LMNN.

from sklearn.datasets import load_wine
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from metric_learn import LMNN

X_train, X_test, y_train, y_test = train_test_split(*load_wine(return_X_y=True))
lmnn_knn = Pipeline(steps=[('lmnn', LMNN()), ('knn', KNeighborsClassifier())])
parameters = {'lmnn__k': [1, 2], 'knn__n_neighbors': [1, 2]}
grid_lmnn_knn = GridSearchCV(lmnn_knn, parameters, cv=3, n_jobs=-1, verbose=True)
grid_lmnn_knn.fit(X_train, y_train)
grid_lmnn_knn.score(X_test, y_test)

To illustrate the weakly-supervised learning API, the following code snippet computes cross validation scores for MMC on pairs from Labeled Faces in the Wild (Huang et al., 2012).

from sklearn.datasets import fetch_lfw_pairs
from sklearn.model_selection import cross_validate, train_test_split
from metric_learn import MMC
Researcher Affiliation | Collaboration | William de Vazelhes (EMAIL), Paris Research Center, Huawei Technologies, 92100 Boulogne-Billancourt, France; CJ Carey (EMAIL), Google LLC, 111 8th Ave, New York, NY 10011, USA; Yuan Tang (EMAIL), Ant Group, 525 Almanor Ave, Sunnyvale, CA 94085, USA; Nathalie Vauquier (EMAIL); Aurélien Bellet (EMAIL), Magnet Team, INRIA Lille Nord Europe, 59650 Villeneuve d'Ascq, France
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks describing the metric learning algorithms. It provides Python code snippets illustrating how to use the metric-learn library, but these are not pseudocode for the algorithms themselves. The algorithms are referred to by name and citation (e.g., NCA, LMNN).
Open Source Code | Yes | metric-learn is an open source Python package implementing supervised and weakly-supervised distance metric learning algorithms. As part of scikit-learn-contrib, it provides a unified interface compatible with scikit-learn which allows to easily perform cross-validation, model selection, and pipelining with other machine learning estimators. metric-learn is thoroughly tested and available on PyPI under the MIT license. ... The source code is available on GitHub at http://github.com/scikit-learn-contrib/metric-learn and is free to use, provided under the MIT license.
Open Datasets | Yes | To illustrate, the following code snippet trains a Pipeline composed of LMNN followed by a k-nearest neighbors classifier on the UCI Wine data set... ... To illustrate the weakly-supervised learning API, the following code snippet computes cross validation scores for MMC on pairs from Labeled Faces in the Wild (Huang et al., 2012).
Dataset Splits | Yes | X_train, X_test, y_train, y_test = train_test_split(*load_wine(return_X_y=True)) ... grid_lmnn_knn = GridSearchCV(lmnn_knn, parameters, cv=3, n_jobs=-1, verbose=True) ... cross_validate(MMC(diagonal=True), pairs, y_pairs, scoring='roc_auc', return_train_score=True, cv=3, n_jobs=-1, verbose=True)
Hardware Specification | No | The paper does not provide specific hardware details (such as GPU/CPU models, memory, or specific computing environments) used for running the experiments or for the development of the metric-learn package.
Software Dependencies | No | The current release of metric-learn (v0.6.2) can be installed from the Python Package Index (PyPI) and conda-forge, for Python 3.6 or later. The source code is available on GitHub at http://github.com/scikit-learn-contrib/metric-learn and is free to use, provided under the MIT license. metric-learn depends on core libraries from the SciPy ecosystem: numpy, scipy, and scikit-learn.
Experiment Setup | Yes | lmnn_knn = Pipeline(steps=[('lmnn', LMNN()), ('knn', KNeighborsClassifier())]) parameters = {'lmnn__k': [1, 2], 'knn__n_neighbors': [1, 2]} grid_lmnn_knn = GridSearchCV(lmnn_knn, parameters, cv=3, n_jobs=-1, verbose=True) ... cross_validate(MMC(diagonal=True), pairs, y_pairs, scoring='roc_auc', return_train_score=True, cv=3, n_jobs=-1, verbose=True)
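The quoted supervised-API evidence shows the paper's pipeline-plus-grid-search pattern, which requires the metric-learn package. As a minimal runnable sketch of the same pattern using only scikit-learn, NeighborhoodComponentsAnalysis (scikit-learn's implementation of NCA, one of the algorithms the paper names) stands in for LMNN; the estimator name 'nca' and the parameter grid here are illustrative choices, not from the paper:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import Pipeline

X_train, X_test, y_train, y_test = train_test_split(
    *load_wine(return_X_y=True), random_state=0)

# NCA stands in for LMNN here; with metric-learn installed, LMNN() drops
# into the same Pipeline slot (its neighbor-count parameter is `k` in v0.6.x).
nca_knn = Pipeline(steps=[('nca', NeighborhoodComponentsAnalysis()),
                          ('knn', KNeighborsClassifier())])
parameters = {'nca__n_components': [2, 4], 'knn__n_neighbors': [1, 2]}

# Same selection procedure as the quoted snippet: 3-fold grid search.
grid = GridSearchCV(nca_knn, parameters, cv=3, n_jobs=-1)
grid.fit(X_train, y_train)
print(grid.score(X_test, y_test))
```

Because the metric learner is just another scikit-learn transformer in the Pipeline, model selection tunes the metric's hyperparameters jointly with the downstream classifier's.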
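The quoted weakly-supervised evidence relies on metric-learn's MMC and the Labeled Faces in the Wild pairs, both assumptions in a generic environment. The cross_validate call itself is standard scikit-learn; as an illustrative sketch, the same options (3-fold CV, ROC AUC scoring, train scores returned) work with any binary classifier, here LogisticRegression on the breast-cancer dataset as a stand-in:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = load_breast_cancer(return_X_y=True)

# Same cross_validate options as the quoted MMC example: 3-fold CV,
# ROC AUC scoring, train scores returned, all cores used.
scores = cross_validate(LogisticRegression(max_iter=5000), X, y,
                        scoring='roc_auc', return_train_score=True,
                        cv=3, n_jobs=-1)
print(sorted(scores))  # ['fit_time', 'score_time', 'test_score', 'train_score']
print(scores['test_score'].mean())
```

With metric-learn installed, MMC(diagonal=True) together with the (pairs, y_pairs) arrays from fetch_lfw_pairs would take the place of the classifier and (X, y), exactly as in the quote.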