metric-learn: Metric Learning Algorithms in Python

Authors: William de Vazelhes, CJ Carey, Yuan Tang, Nathalie Vauquier, Aurélien Bellet

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To illustrate, the following code snippet trains a Pipeline composed of LMNN followed by a k-nearest neighbors classifier on the UCI Wine data set, with the hyperparameters selected with a grid-search. Any other supervised metric learner can be used in place of LMNN.

from sklearn.datasets import load_wine
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from metric_learn import LMNN

X_train, X_test, y_train, y_test = train_test_split(*load_wine(return_X_y=True))
lmnn_knn = Pipeline(steps=[('lmnn', LMNN()), ('knn', KNeighborsClassifier())])
parameters = {'lmnn__k': [1, 2], 'knn__n_neighbors': [1, 2]}
grid_lmnn_knn = GridSearchCV(lmnn_knn, parameters, cv=3, n_jobs=-1, verbose=True)
grid_lmnn_knn.fit(X_train, y_train)
grid_lmnn_knn.score(X_test, y_test)

To illustrate the weakly-supervised learning API, the following code snippet computes cross validation scores for MMC on pairs from Labeled Faces in the Wild (Huang et al., 2012).

from sklearn.datasets import fetch_lfw_pairs
from sklearn.model_selection import cross_validate, train_test_split
from metric_learn import MMC
Researcher Affiliation | Collaboration | William de Vazelhes (EMAIL), Paris Research Center, Huawei Technologies, 92100 Boulogne-Billancourt, France; CJ Carey (EMAIL), Google LLC, 111 8th Ave, New York, NY 10011, USA; Yuan Tang (EMAIL), Ant Group, 525 Almanor Ave, Sunnyvale, CA 94085, USA; Nathalie Vauquier (EMAIL); Aurélien Bellet (EMAIL), Magnet Team, INRIA Lille Nord Europe, 59650 Villeneuve d'Ascq, France
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks describing the metric learning algorithms. It provides Python code snippets illustrating how to use the metric-learn library, but these are not pseudocode for the algorithms themselves. The algorithms are referred to by name and citation (e.g., NCA, LMNN).
Open Source Code | Yes | metric-learn is an open source Python package implementing supervised and weakly-supervised distance metric learning algorithms. As part of scikit-learn-contrib, it provides a unified interface compatible with scikit-learn which allows to easily perform cross-validation, model selection, and pipelining with other machine learning estimators. metric-learn is thoroughly tested and available on PyPI under the MIT license. ... The source code is available on GitHub at http://github.com/scikit-learn-contrib/metric-learn and is free to use, provided under the MIT license.
Open Datasets | Yes | To illustrate, the following code snippet trains a Pipeline composed of LMNN followed by a k-nearest neighbors classifier on the UCI Wine data set... ... To illustrate the weakly-supervised learning API, the following code snippet computes cross validation scores for MMC on pairs from Labeled Faces in the Wild (Huang et al., 2012).
Dataset Splits | Yes | X_train, X_test, y_train, y_test = train_test_split(*load_wine(return_X_y=True)) ... grid_lmnn_knn = GridSearchCV(lmnn_knn, parameters, cv=3, n_jobs=-1, verbose=True) ... cross_validate(MMC(diagonal=True), pairs, y_pairs, scoring='roc_auc', return_train_score=True, cv=3, n_jobs=-1, verbose=True)
Hardware Specification | No | The paper does not provide specific hardware details (such as GPU/CPU models, memory, or specific computing environments) used for running the experiments or for the development of the metric-learn package.
Software Dependencies | No | The current release of metric-learn (v0.6.2) can be installed from the Python Package Index (PyPI) and conda-forge, for Python 3.6 or later. The source code is available on GitHub at http://github.com/scikit-learn-contrib/metric-learn and is free to use, provided under the MIT license. metric-learn depends on core libraries from the SciPy ecosystem: numpy, scipy, and scikit-learn.
Experiment Setup | Yes | lmnn_knn = Pipeline(steps=[('lmnn', LMNN()), ('knn', KNeighborsClassifier())]) parameters = {'lmnn__k': [1, 2], 'knn__n_neighbors': [1, 2]} grid_lmnn_knn = GridSearchCV(lmnn_knn, parameters, cv=3, n_jobs=-1, verbose=True) ... cross_validate(MMC(diagonal=True), pairs, y_pairs, scoring='roc_auc', return_train_score=True, cv=3, n_jobs=-1, verbose=True)
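The quoted supervised-API evidence shows the paper's pipeline-plus-grid-search pattern, which requires the metric-learn package. As a minimal runnable sketch of the same pattern using only scikit-learn, NeighborhoodComponentsAnalysis (scikit-learn's implementation of NCA, one of the algorithms the paper names) stands in for LMNN; the estimator name 'nca' and the parameter grid here are illustrative choices, not from the paper:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import Pipeline

X_train, X_test, y_train, y_test = train_test_split(
    *load_wine(return_X_y=True), random_state=0)

# NCA stands in for LMNN here; with metric-learn installed, LMNN() drops
# into the same Pipeline slot (its neighbor-count parameter is `k` in v0.6.x).
nca_knn = Pipeline(steps=[('nca', NeighborhoodComponentsAnalysis()),
                          ('knn', KNeighborsClassifier())])
parameters = {'nca__n_components': [2, 4], 'knn__n_neighbors': [1, 2]}

# Same selection procedure as the quoted snippet: 3-fold grid search.
grid = GridSearchCV(nca_knn, parameters, cv=3, n_jobs=-1)
grid.fit(X_train, y_train)
print(grid.score(X_test, y_test))
```

Because the metric learner is just another scikit-learn transformer in the Pipeline, model selection tunes the metric's hyperparameters jointly with the downstream classifier's.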
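The quoted weakly-supervised evidence relies on metric-learn's MMC and the Labeled Faces in the Wild pairs, both assumptions in a generic environment. The cross_validate call itself is standard scikit-learn; as an illustrative sketch, the same options (3-fold CV, ROC AUC scoring, train scores returned) work with any binary classifier, here LogisticRegression on the breast-cancer dataset as a stand-in:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = load_breast_cancer(return_X_y=True)

# Same cross_validate options as the quoted MMC example: 3-fold CV,
# ROC AUC scoring, train scores returned, all cores used.
scores = cross_validate(LogisticRegression(max_iter=5000), X, y,
                        scoring='roc_auc', return_train_score=True,
                        cv=3, n_jobs=-1)
print(sorted(scores))  # ['fit_time', 'score_time', 'test_score', 'train_score']
print(scores['test_score'].mean())
```

With metric-learn installed, MMC(diagonal=True) together with the (pairs, y_pairs) arrays from fetch_lfw_pairs would take the place of the classifier and (X, y), exactly as in the quote.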