On Multilabel Classification and Ranking with Bandit Feedback

Authors: Claudio Gentile, Francesco Orabona

JMLR 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Though the emphasis is on theoretical results, we also validate our algorithms on real-world multilabel data sets under several experimental conditions: data set size, label set size, loss functions, training mode and performance (online vs. batch), label generation model (linear vs. logistic). Under all such conditions, our algorithms are contrasted against the corresponding multilabel/ranking baselines that operate with full information, often showing (surprisingly enough) comparable prediction performance.
Researcher Affiliation | Academia | Claudio Gentile, DiSTA, Università dell'Insubria, via Mazzini 5, 21100 Varese, Italy; Francesco Orabona, Toyota Technological Institute at Chicago, 6045 South Kenwood Avenue, Chicago, IL 60637, USA
Pseudocode | Yes | Figure 1: The partial feedback algorithm in the (ordered) multiple label setting: the linear model case. Figure 2: The partial feedback algorithm in the (ordered) multiple label setting: the generalized linear model case.
Open Source Code | No | The paper does not provide explicit statements about open-sourcing the code, nor does it include links to a code repository.
Open Datasets | Yes | We used three diverse multilabel data sets, intended to represent different real-world conditions. The first one, called Mediamill, was introduced in a video annotation challenge (Snoek et al., 2006). [...] The second data set is the music annotated Sony CSL Paris data set (Pachet and Roy, 2009), [...] The third one is the smaller Yeast data set (Elisseeff and Weston, 2002).
Dataset Splits | Yes | The first one, called Mediamill, was introduced in a video annotation challenge (Snoek et al., 2006). It comprises 30,993 training samples and 12,914 test ones. [...] The second data set is the music annotated Sony CSL Paris data set (Pachet and Roy, 2009), made up of 16,452 training samples and 16,519 test samples, [...] The third one is the smaller Yeast data set (Elisseeff and Weston, 2002), made up of 1,500 training samples, 917 test samples, with d = 103 and K = 14.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments.
Software Dependencies | No | The paper does not mention specific software names with version numbers required to replicate the experiment.
Experiment Setup | Yes | For the practical implementation of the algorithm in Figure 2, we simplified the formula for ϵ_{i,t}^2. [...] where α is a parameter that we found by cross-validation on each data set across the range α = 2^-8, 2^-7, ..., 2^7, 2^8, for each choice of the label-generation model, loss setting, and value of S (see below). We have considered two different loss functions L, the square loss and the logistic loss (denoted by Log Loss in our plots). [...] In the logistic case, it makes sense in practice not to place any restrictions on the margin domain D, so that we set R = ∞. Again, because our upper bounding analysis would yield as a consequence c_L = 0, we instead set c_L to a small positive constant, specifically c_L = 0.1, with no special attention to its fine-tuning. The setting of the cost function c(i, s) depends on the task at hand, and we decided to evaluate two possible settings. The first one, denoted by "decreasing", is c(i, s) = (s - i + 1)/s, i = 1, ..., s; the second one, denoted by "constant", is c(i, s) = 1, for all i and s. In all experiments with ℓ_{a,c}, the a parameter was set to 0.5 [...] We did so by imposing, for all t, an upper bound S_t = S on |Ŷ_t|. For each of the three data sets, we tried out the four different values of S reported in the last four columns of Table 1.
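The two cost settings and the α cross-validation grid quoted above can be sketched in a few lines. This is a hedged illustration, not code from the paper (which released none); the function names `cost_decreasing`, `cost_constant`, and `alpha_grid` are hypothetical, chosen only to mirror the quoted formulas.

```python
def cost_decreasing(i: int, s: int) -> float:
    """'Decreasing' cost setting: c(i, s) = (s - i + 1) / s, for i = 1, ..., s.
    Early slots in the ranking are charged more than later ones."""
    return (s - i + 1) / s

def cost_constant(i: int, s: int) -> float:
    """'Constant' cost setting: c(i, s) = 1 for all i and s."""
    return 1.0

# Cross-validation grid for alpha: 2^-8, 2^-7, ..., 2^7, 2^8 (17 values).
alpha_grid = [2.0 ** k for k in range(-8, 9)]

# Example: decreasing costs over the s = 3 slots.
# c(1,3) = 3/3, c(2,3) = 2/3, c(3,3) = 1/3
print([cost_decreasing(i, 3) for i in range(1, 4)])
```

Note that under the "decreasing" setting the first predicted label always has cost 1 and the last has cost 1/s, matching the paper's intent of penalizing mistakes near the top of the ranking more heavily.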