Supervised Word Mover's Distance
Authors: Gao Huang, Chuan Guo, Matt J. Kusner, Yu Sun, Fei Sha, Kilian Q. Weinberger
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate S-WMD on eight real-world text classification tasks on which it consistently outperforms almost all of our 26 competitive baselines. We evaluate all approaches on 8 document datasets in the settings of news categorization, sentiment analysis, and product identification, among others. Table 1 describes the classification tasks as well as the size and number of classes C of each of the datasets. We evaluate against the following document representation/distance methods: ... Table 2: The kNN test error for all datasets and distances. |
| Researcher Affiliation | Academia | Gao Huang, Chuan Guo (Cornell University); Matt J. Kusner (Alan Turing Institute, University of Warwick); Yu Sun, Kilian Q. Weinberger (Cornell University); Fei Sha (University of California, Los Angeles) |
| Pseudocode | Yes | Algorithm 1 S-WMD |
| Open Source Code | Yes | Our code is implemented in Matlab and is freely available at https://github.com/gaohuang/S-WMD. |
| Open Datasets | Yes | We evaluate S-WMD on 8 different document corpora... Table 1: The document datasets (and their descriptions) used for visualization and evaluation. ... REUTERS news dataset (train/test split [3]) ... TWITTER tweets categorized by sentiment [31] ... 20NEWS canonical news article dataset [3] |
| Dataset Splits | No | For datasets that do not have a predefined train/test split (BBCSPORT, TWITTER, RECIPE, CLASSIC, and AMAZON), we average results over five 70/30 train/test splits and report standard errors. The paper does not explicitly state train/validation/test splits or mention a specific validation-split percentage. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | Our code is implemented in Matlab. The paper does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | In our experiments, we use λ = 10, which leads to a nice trade-off between speed and approximation accuracy. In our experiments, we set B = 32 and N = 200, and computing the gradient at each iteration can be done in seconds. |
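The λ = 10 quoted above is the entropy-regularization strength of the Sinkhorn approximation that S-WMD uses to make Word Mover's Distance (and its gradient) fast to compute. The paper's released code is Matlab; below is a minimal, hedged Python/NumPy sketch of just this Sinkhorn-approximated transport distance between two documents represented as normalized bags of word vectors. The function name `sinkhorn_wmd` and all inputs are illustrative, and the supervised part of S-WMD (the learned linear transformation and word-importance weights) is deliberately omitted.

```python
import numpy as np

def sinkhorn_wmd(X1, d1, X2, d2, lam=10.0, n_iter=200):
    """Entropy-regularized (Sinkhorn) approximation to the Word Mover's
    Distance between two documents.

    X1 : (n1, d) word-embedding matrix of document 1 (illustrative input)
    d1 : (n1,) normalized bag-of-words weights of document 1
    X2, d2 : same for document 2
    lam : regularization strength (the paper reports lambda = 10)
    """
    # Pairwise squared-Euclidean transport costs between word vectors.
    M = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=2)
    # Gibbs kernel; larger lam means a sharper (closer-to-exact) plan.
    K = np.exp(-lam * M)
    u = np.ones(len(d1)) / len(d1)
    # Sinkhorn fixed-point iterations on the row-scaling vector u.
    for _ in range(n_iter):
        u = d1 / (K @ (d2 / (K.T @ u)))
    v = d2 / (K.T @ u)
    T = u[:, None] * K * v[None, :]  # approximate transport plan
    return np.sum(T * M)             # approximate transport cost
```

In the exact WMD, the transport plan comes from a linear program; the entropic term lets Sinkhorn's matrix-scaling iterations replace that solver, which is what makes the per-iteration gradient computation "done in seconds" feasible at the paper's batch sizes.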