metric-learn: Metric Learning Algorithms in Python
Authors: William de Vazelhes, CJ Carey, Yuan Tang, Nathalie Vauquier, Aurélien Bellet
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To illustrate, the following code snippet trains a Pipeline composed of LMNN followed by a k-nearest neighbors classifier on the UCI Wine data set, with the hyperparameters selected with a grid-search. Any other supervised metric learner can be used in place of LMNN. `from sklearn.datasets import load_wine` `from sklearn.neighbors import KNeighborsClassifier` `from sklearn.model_selection import train_test_split, GridSearchCV` `from sklearn.pipeline import Pipeline` `from metric_learn import LMNN` `X_train, X_test, y_train, y_test = train_test_split(*load_wine(return_X_y=True))` `lmnn_knn = Pipeline(steps=[('lmnn', LMNN()), ('knn', KNeighborsClassifier())])` `parameters = {'lmnn__k': [1, 2], 'knn__n_neighbors': [1, 2]}` `grid_lmnn_knn = GridSearchCV(lmnn_knn, parameters, cv=3, n_jobs=-1, verbose=True)` `grid_lmnn_knn.fit(X_train, y_train)` `grid_lmnn_knn.score(X_test, y_test)` To illustrate the weakly-supervised learning API, the following code snippet computes cross validation scores for MMC on pairs from Labeled Faces in the Wild (Huang et al., 2012). `from sklearn.datasets import fetch_lfw_pairs` `from sklearn.model_selection import cross_validate, train_test_split` `from metric_learn import MMC` |
| Researcher Affiliation | Collaboration | William de Vazelhes EMAIL, Paris Research Center, Huawei Technologies, 92100 Boulogne-Billancourt, France; CJ Carey EMAIL, Google LLC, 111 8th Ave, New York, NY 10011, USA; Yuan Tang EMAIL, Ant Group, 525 Almanor Ave, Sunnyvale, CA 94085, USA; Nathalie Vauquier EMAIL, Aurélien Bellet EMAIL, Magnet Team, INRIA Lille Nord Europe, 59650 Villeneuve d'Ascq, France |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks describing the metric learning algorithms. It provides Python code snippets illustrating how to use the 'metric-learn' library, but these are not pseudocode for the algorithms themselves. The algorithms are referred to by name and citation (e.g., NCA, LMNN). |
| Open Source Code | Yes | metric-learn is an open source Python package implementing supervised and weakly-supervised distance metric learning algorithms. As part of scikit-learn-contrib, it provides a unified interface compatible with scikit-learn which allows to easily perform cross-validation, model selection, and pipelining with other machine learning estimators. metric-learn is thoroughly tested and available on PyPi under the MIT license. ... The source code is available on GitHub at http://github.com/scikit-learn-contrib/metric-learn and is free to use, provided under the MIT license. |
| Open Datasets | Yes | To illustrate, the following code snippet trains a Pipeline composed of LMNN followed by a k-nearest neighbors classifier on the UCI Wine data set... ... To illustrate the weakly-supervised learning API, the following code snippet computes cross validation scores for MMC on pairs from Labeled Faces in the Wild (Huang et al., 2012). |
| Dataset Splits | Yes | `X_train, X_test, y_train, y_test = train_test_split(*load_wine(return_X_y=True))` ... `grid_lmnn_knn = GridSearchCV(lmnn_knn, parameters, cv=3, n_jobs=-1, verbose=True)` ... `cross_validate(MMC(diagonal=True), pairs, y_pairs, scoring='roc_auc', return_train_score=True, cv=3, n_jobs=-1, verbose=True)` |
| Hardware Specification | No | The paper does not provide specific hardware details (such as GPU/CPU models, memory, or specific computing environments) used for running the experiments or for the development of the 'metric-learn' package. |
| Software Dependencies | No | The current release of metric-learn (v0.6.2) can be installed from the Python Package Index (PyPI) and conda-forge, for Python 3.6 or later. The source code is available on GitHub at http://github.com/scikit-learn-contrib/metric-learn and is free to use, provided under the MIT license. metric-learn depends on core libraries from the SciPy ecosystem: numpy, scipy, and scikit-learn. |
| Experiment Setup | Yes | `lmnn_knn = Pipeline(steps=[('lmnn', LMNN()), ('knn', KNeighborsClassifier())])` `parameters = {'lmnn__k': [1, 2], 'knn__n_neighbors': [1, 2]}` `grid_lmnn_knn = GridSearchCV(lmnn_knn, parameters, cv=3, n_jobs=-1, verbose=True)` ... `cross_validate(MMC(diagonal=True), pairs, y_pairs, scoring='roc_auc', return_train_score=True, cv=3, n_jobs=-1, verbose=True)` |
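
The split-then-grid-search pattern quoted in the Dataset Splits and Experiment Setup rows can be sketched as a standalone script. This is a minimal sketch, not the paper's code. To keep it runnable with scikit-learn alone, it swaps in scikit-learn's own `NeighborhoodComponentsAnalysis` (NCA) for metric-learn's `LMNN`. The `StandardScaler` step, the `random_state` values, the stratified split, and the `nca__n_components` grid are assumptions added here for illustration. With metric-learn installed, `LMNN()` would take the place of the `nca` step, and the grid key for its hyperparameter would need to match the installed version.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# UCI Wine data set, bundled with scikit-learn (no download needed).
X, y = load_wine(return_X_y=True)

# Hold out a test set; the fixed seed and stratification are assumptions
# added for repeatability, not settings taken from the paper.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Metric learner followed by k-NN. NCA is a stand-in for metric_learn.LMNN.
pipeline = Pipeline(steps=[
    ("scale", StandardScaler()),  # assumption: scaling is not in the paper's snippet
    ("nca", NeighborhoodComponentsAnalysis(random_state=0)),
    ("knn", KNeighborsClassifier()),
])

# Hyperparameters of each step are addressed as "<step name>__<parameter>".
parameters = {"nca__n_components": [2, 5], "knn__n_neighbors": [1, 2]}

# 3-fold cross-validated grid search on the training split only.
grid = GridSearchCV(pipeline, parameters, cv=3)
grid.fit(X_train, y_train)

# Accuracy of the best configuration, refit on the full training split.
test_accuracy = grid.score(X_test, y_test)
print(grid.best_params_, test_accuracy)
```

Because the metric learner sits inside the `Pipeline`, it is refit on each training fold during the search, so the held-out fold never influences the learned metric.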