The Self-Normalized Estimator for Counterfactual Learning

Authors: Adith Swaminathan, Thorsten Joachims

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the empirical effectiveness of Norm-POEM on several multi-label classification problems, finding that it consistently outperforms the conventional estimator.
Researcher Affiliation | Academia | Adith Swaminathan, Department of Computer Science, Cornell University; Thorsten Joachims, Department of Computer Science, Cornell University
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Software implementing Norm-POEM is available at http://www.cs.cornell.edu/~adith/POEM.
Open Datasets | Yes | The experiments use supervised multi-label classification datasets from the LibSVM repository. The inputs are x ∈ R^p, and the predictions y ∈ {0, 1}^q are bit vectors indicating the labels assigned to x. The datasets cover a range of feature counts p, label counts q, and instance counts n:

Name  | p (# features) | q (# labels) | n_train | n_test
Scene | 294            | 6            | 1211    | 1196
Yeast | 103            | 14           | 1500    | 917
TMC   | 30438          | 22           | 21519   | 7077
LYRL  | 47236          | 4            | 23149   | 781265
Dataset Splits | Yes | "Hyper-parameters λ, M were calibrated as recommended and validated on a 25% hold-out of D. In summary, our experimental setup is identical to POEM [1]."
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models or other computational resources.
Software Dependencies | No | The paper mentions that "CRF is implemented by scikit-learn [27]", but it does not specify the version of scikit-learn or any other software dependency.
Experiment Setup | Yes | "Hyper-parameters λ, M were calibrated as recommended and validated on a 25% hold-out of D. In summary, our experimental setup is identical to POEM [1]." Also: "To simulate a bandit feedback dataset D, we use a CRF with default hyper-parameters trained on 5% of the supervised dataset as h0, and replay the training data 4 times and collect sampled labels from h0." And: "Since the choice of optimization method could be a confounder, we use L-BFGS for all methods and experiments."
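For context, the self-normalized estimator in the paper's title contrasts with conventional inverse-propensity scoring (IPS) by dividing the weighted loss by the realized sum of importance weights rather than by the sample count. The sketch below illustrates that contrast on synthetic numbers; the variable names, data, and toy setup are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy logged bandit feedback: for each context, the logging policy h0 sampled
# an action with propensity p0 and a loss delta was observed. The target
# policy assigns probability pw to that same action. All values here are
# synthetic illustrations.
n = 10_000
p0 = rng.uniform(0.1, 0.9, size=n)                   # logging propensities
pw = rng.uniform(0.1, 0.9, size=n)                   # target-policy probabilities
delta = rng.binomial(1, 0.5, size=n).astype(float)   # observed losses in [0, 1]

w = pw / p0  # importance weights

# Conventional IPS estimate of the target policy's risk: divide by n.
ips = (w * delta).mean()

# Self-normalized estimate: divide by the realized weight sum instead of n.
# Because it is a weighted average of the observed losses, it always stays
# within their range, which tames the variance of large importance weights.
snips = (w * delta).sum() / w.sum()

print(f"IPS: {ips:.4f}  SNIPS: {snips:.4f}")
```

Note that the self-normalized estimate is a convex combination of the observed losses, so it is bounded by their minimum and maximum, whereas plain IPS can leave that range when the weights are large.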