Ground Metric Learning
Authors: Marco Cuturi, David Avis
JMLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We follow the presentation of our algorithms with promising experimental results which show that this approach is useful both for retrieval and binary/multiclass classification tasks. |
| Researcher Affiliation | Academia | Marco Cuturi and David Avis, Graduate School of Informatics, Kyoto University, 36-1 Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan |
| Pseudocode | Yes | Algorithm 1: Computation of z = Sk(M) and a subgradient γ, where the superscript is either + or −. Algorithm 2: Projected Subgradient Descent to minimize Ck. Algorithm 3: Initial Point M0 to minimize Ck. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It mentions using external tools like "CPLEX Matlab API implementation of network flows" and "metric Nearness toolbox released online by Suvrit Sra" and an "implementation provided by the INRIA-LEAR team" but not its own code. |
| Open Datasets | Yes | We study in this section the performance of ground metric learning when coupled with a nearest neighbor classifier on binary classification tasks generated with the Caltech-256 database. We also consider 6 multiclass classification data sets covering text and image data. The properties of the data sets and parameters used in our experiments are summarized in Table 1. The dimensions of the features have been kept low to ensure that the computation of optimal transports is tractable. We follow the recommended train/test splits for these data sets; if they are not provided, we split the data sets arbitrarily. Features are formed using either LDA (Blei et al., 2003) or SIFT (Lowe, 1999). Table 1: Multiclass classification data sets and their parameters: 20 News Group, Reuters, MIT Scene, UIUC Scene, Oxford Flower, Caltech-101. |
| Dataset Splits | Yes | For each pair, we split the 80 + 80 available points into 30+30 points to train distance parameters and 50+50 points to form a test set. This amounts to having n = 60 training points following the notations introduced in Section 3.1. |
| Hardware Specification | Yes | The algorithm takes about 300 steps to converge (Figures 8 and 9), which, using a single Xeon 2.6 GHz core, 60 training points and d = 128 (the experimental setting considered below), takes about 900 seconds. |
| Software Dependencies | No | The paper mentions using "CPLEX Matlab API implementation of network flows" and the "metric Nearness toolbox released online by Suvrit Sra" but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | The neighborhood parameter k is set to 3 to be directly comparable to the default parameter setting of ITML and LMNN. In each classification task, and for two images ri and rj, the corresponding weight ωij is set to 1/(nk) if both histograms come from the same class and to −1/(nk) if they come from different classes. The subgradient stepsize t0 of Algorithm 2 is set to 0.1, guided by preliminary experiments and by the fact that, because of the normalization of the weights ωij, both the current iterate Mk in Algorithm 2 and the subgradients γ+ or γ− all have the same 1-norms. We carry out a minimum of 24 subgradient steps in each inner loop and set qmax to 80. Each inner loop is terminated when the objective does not progress more than 0.75% every 8 steps, or when q reaches qmax. We carry out a maximum of 20 outer loop iterations. |
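To make the setup above concrete, the following is a minimal sketch (not the authors' code) of one projected-subgradient step on the ground metric M. It relies on the fact that an optimal transport plan T* is a subgradient of the transport distance with respect to the cost matrix M, and it weights each plan by ωij = ±1/(nk) as in the setup row. The function names are my own, the LP solver is SciPy's `linprog` (the paper used the CPLEX Matlab API), and the projection step here only enforces symmetry, nonnegativity, and a zero diagonal; the paper's true projection onto the cone of metrics also enforces triangle inequalities via the metric nearness toolbox.

```python
import numpy as np
from scipy.optimize import linprog


def transport_plan(r, c, M):
    """Solve min_T <T, M> s.t. T 1 = r, T^T 1 = c, T >= 0 as a small LP."""
    d = len(r)
    A_eq = np.zeros((2 * d, d * d))
    for i in range(d):
        A_eq[i, i * d:(i + 1) * d] = 1.0   # row-sum constraints: T 1 = r
        A_eq[d + i, i::d] = 1.0            # column-sum constraints: T^T 1 = c
    b_eq = np.concatenate([r, c])
    res = linprog(M.ravel(), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return res.x.reshape(d, d)


def gml_step(M, pairs, weights, stepsize):
    """One subgradient step: gamma = sum_ij w_ij T*_ij, then a crude projection."""
    gamma = np.zeros_like(M)
    for (r, c), w in zip(pairs, weights):
        gamma += w * transport_plan(r, c, M)
    M = M - stepsize * gamma
    # Crude stand-in for the paper's projection onto the cone of metrics
    # (the paper additionally enforces triangle inequalities).
    M = 0.5 * (M + M.T)          # symmetrize
    M = np.maximum(M, 0.0)       # nonnegativity
    np.fill_diagonal(M, 0.0)     # zero diagonal
    return M


# Toy usage: two 4-bin histograms, ground metric initialized to the 0/1 metric.
r = np.array([0.5, 0.5, 0.0, 0.0])
c = np.array([0.0, 0.0, 0.5, 0.5])
M0 = np.ones((4, 4)) - np.eye(4)
M1 = gml_step(M0, [(r, c)], [1.0 / (2 * 3)], stepsize=0.1)
```

The stepsize 0.1 and weight 1/(nk) mirror the values quoted in the setup row; the inner/outer loop schedule (at least 24 inner steps, qmax = 80, 0.75% progress test every 8 steps, at most 20 outer iterations) would wrap calls to `gml_step`.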
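The dataset-splits row can likewise be sketched: from the 80 + 80 points available per binary task, 30 + 30 go to metric training and 50 + 50 to the test set, giving n = 60 training points. The helper below is a hypothetical illustration of that protocol (the function name, seed handling, and use of NumPy are my assumptions, not the paper's code).

```python
import numpy as np


def split_pair(class_a, class_b, n_train=30, seed=0):
    """Split each class's points into n_train training points and the rest
    for testing, mirroring the 30+30 / 50+50 protocol described above."""
    rng = np.random.default_rng(seed)

    def one(cls):
        idx = rng.permutation(len(cls))
        return cls[idx[:n_train]], cls[idx[n_train:]]

    tr_a, te_a = one(class_a)
    tr_b, te_b = one(class_b)
    return (tr_a, tr_b), (te_a, te_b)


# Toy usage: 80 points per class, 5-dimensional features.
a = np.arange(80 * 5, dtype=float).reshape(80, 5)
b = -a
(train_a, train_b), (test_a, test_b) = split_pair(a, b)
```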