On the Bayes-Optimality of F-Measure Maximizers
Authors: Willem Waegeman, Krzysztof Dembczyński, Arkadiusz Jachnik, Weiwei Cheng, Eyke Hüllermeier
JMLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Adopting a decision-theoretic perspective, this article provides a formal and experimental analysis of different approaches for maximizing the F-measure. ... In Section 7, we present extensive experimental results to illustrate the practical usefulness of our findings. More specifically, all examined methods are compared for a series of multi-label classification problems. |
| Researcher Affiliation | Collaboration | Willem Waegeman, Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University, Ghent 9000, Belgium ... Weiwei Cheng, Amazon Development Center Germany, Berlin 10707, Germany |
| Pseudocode | Yes | Algorithm 1 General F-measure Maximizer |
| Open Source Code | No | The paper mentions external software for comparison: "The results were obtained by using the software available at http://users.cecs.anu.edu.au/~jpetterson/." However, it does not provide an explicit statement or link for the code implementing the methodology described by the authors in this paper (e.g., the GFM algorithm). |
| Open Datasets | Yes | We test some of the algorithms described above on four commonly used multi-label benchmark data sets with known training and test sets. We take these data sets from the MULAN [8] and LibSVM [9] repositories. [8] This repository can be found at: http://mulan.sourceforge.net/datasets.html. [9] This repository can be found at: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multilabel.html. |
| Dataset Splits | Yes | We test some of the algorithms described above on four commonly used multi-label benchmark data sets with known training and test sets. ... We use 5-fold cross-validation and choose the regularization parameter from the following set of possible values {10^-4, 10^-3, ..., 10^3}. ... This is a minor difference in comparison to the competition results, which are computed over 90% of test examples. The remaining 10% of test examples constitute a validation set that served for computing the scores for the leader board during the competition. |
| Hardware Specification | Yes | We run these simulations, as well as the other experiments described later in this paper, on a Debian virtual machine with 8-core x64 processor and 5GB RAM. |
| Software Dependencies | No | The paper mentions using "Weka (Hall et al., 2009)", "Mulan (Tsoumakas et al., 2011)", and "Mallet (McCallum, 2002)". However, specific version numbers for these software packages or programming languages are not provided. |
| Experiment Setup | Yes | We use a different number of nearest neighbors, l ∈ {10, 20, 50, 100}. ... We use 5-fold cross-validation and choose the regularization parameter from the following set of possible values {10^-4, 10^-3, ..., 10^3}. ... The maximal number of iterations in the cutting-plane algorithm has been set to 1000. |
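The Pseudocode row refers to Algorithm 1, the General F-measure Maximizer (GFM). A minimal sketch of that procedure is shown below, assuming the standard formulation: the input is a matrix of joint probabilities `P[i, s] = P(y_i = 1, s_y = s+1)` (where `s_y` is the number of relevant labels) plus `P(y = 0)`, and the maximizer compares, for each candidate number of predicted labels `k`, the top-`k` entries of a weighted column. The function name and NumPy layout are illustrative choices, not the paper's exact notation.

```python
import numpy as np

def gfm(P, p_zero):
    """Sketch of the General F-measure Maximizer (GFM).

    P[i, s]: estimate of P(y_i = 1, s_y = s+1), i.e. label i is relevant
             and exactly s+1 labels are relevant overall (s = 0..m-1).
    p_zero:  estimate of P(y = 0), the all-irrelevant label vector
             (by convention F(0, 0) = 1, so predicting nothing scores p_zero).
    Returns (h, F): the 0/1 prediction vector maximizing expected F and its value.
    """
    m = P.shape[0]
    # Weight matrix W[s, k] = 2 / ((s+1) + (k+1)): the F-measure denominator
    # contribution when s+1 labels are relevant and k+1 are predicted.
    counts = np.arange(1, m + 1)
    W = 2.0 / (counts[:, None] + counts[None, :])
    # Delta[i, k]: expected-F gain from predicting label i with k+1 positives.
    Delta = P @ W
    best_h, best_F = np.zeros(m, dtype=int), p_zero  # k = 0 baseline
    for k in range(1, m + 1):
        top = np.argsort(Delta[:, k - 1])[-k:]  # k largest entries of column k
        F = Delta[top, k - 1].sum()
        if F > best_F:
            best_h = np.zeros(m, dtype=int)
            best_h[top] = 1
            best_F = F
    return best_h, best_F
```

For a two-label distribution with P(y = (1,0)) = P(y = (1,1)) = 0.5, both predicting label 1 alone and predicting both labels yield expected F of 5/6, and the sketch recovers that optimum.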