Jointly Informative Feature Selection Made Tractable by Gaussian Modeling

Authors: Leonidas Lefakis, François Fleuret

JMLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "An empirical evaluation using several types of classifiers on multiple data sets show that this class of methods outperforms state-of-the-art baselines, both in terms of speed and classification accuracy. Keywords: feature selection, mutual information, entropy, mixture of Gaussians." ... "In this section we present an empirical evaluation of the proposed algorithms. We first show on a synthetic controlled experiment that they behave as expected regarding groups of jointly informative features, and then provide results obtained on three popular real-world computer vision data sets."
Researcher Affiliation | Collaboration | Leonidas Lefakis (EMAIL), Zalando Research, Zalando SE, Berlin, Germany; François Fleuret (EMAIL), Computer Vision and Learning group, Idiap Research Institute, Martigny, Switzerland.
Pseudocode | Yes | Table 2: Greedy Forward Subset Selection

S_0 ← ∅
for n = 1 ... N do
    s* ← 0
    for X_j ∈ F \ S_{n-1} do
        S ← S_{n-1} ∪ {X_j}
        s ← I(S; Y)
        if s > s* then
            s* ← s; S* ← S
        end if
    end for
    S_n ← S*
end for
return S_N
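The greedy forward subset selection of Table 2 can be sketched in a few lines. The sketch below follows the paper's Gaussian-modeling idea in spirit: it fits a Gaussian to each class-conditional p(S | Y = y) and moment-matches a single Gaussian to the marginal p(S), so that I(S; Y) = H(S) − H(S | Y) reduces to log-determinants of covariance matrices. The regularization constant and all function names here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy of a Gaussian with covariance `cov` (nats)."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

def mutual_information(X, y, subset, eps=1e-6):
    """I(S; Y) with Gaussian class-conditionals and a moment-matched
    Gaussian approximation of the marginal p(S). `eps` is an assumed
    ridge term to keep covariances well-conditioned."""
    Xs = X[:, subset]
    reg = eps * np.eye(len(subset))
    h_marginal = gaussian_entropy(np.cov(Xs, rowvar=False) + reg)
    h_conditional = 0.0
    for c in np.unique(y):
        Xc = Xs[y == c]
        h_conditional += (len(Xc) / len(Xs)) * gaussian_entropy(
            np.cov(Xc, rowvar=False) + reg)
    return h_marginal - h_conditional

def greedy_forward_selection(X, y, N):
    """Table 2: at each step, add the feature maximizing I(S; Y)."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(N):
        best_score, best_j = -np.inf, None
        for j in remaining:
            score = mutual_information(X, y, selected + [j])
            if score > best_score:
                best_score, best_j = score, j
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

Each iteration scores every remaining candidate jointly with the features already chosen, which is what makes the selection "jointly informative" rather than a per-feature ranking.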
Open Source Code | No | The paper mentions using 'the code provided by the authors' for pre-processing external datasets (Coates and Ng, 2011) and discusses its own 'C++ implementations' for performance comparison. However, there is no explicit statement or link indicating that the authors have made their own code publicly available for the methodology described in this paper.
Open Datasets | Yes | We report results on three standard computer vision data sets which we used for our experiments: CIFAR-10 contains images of size 32×32 of 10 distinct classes depicting vehicles and animals. ... INRIA is a pedestrian detection data set. ... STL-10 consists of images of size 96×96 belonging to 10 classes, each represented by 500 training images. As for CIFAR we pre-process the data as in (Coates and Ng, 2011), resulting in a pool F of 4,096 features.
Dataset Splits | No | The paper states: 'CIFAR-10... The training data consists of 5,000 images of each class.' and 'STL-10... each represented by 500 training images.' and 'INRIA... 12,180 training images'. It also discusses 'selecting uniformly at random without replacement' for a finite sample analysis. However, it does not explicitly provide the training/test/validation splits (e.g., percentages or specific counts) for the main experimental results, nor does it refer to standardized splits for all datasets with citations.
Hardware Specification | No | The paper states: 'The computation times provided were obtained with C++ implementations of the proposed methods.' While it discusses CPU time in Table 3, it does not specify any particular CPU model, GPU, or other hardware details (e.g., processor type, memory amount) used for running the experiments.
Software Dependencies | No | The paper mentions the use of C++ implementations for the proposed methods and MRMR, MATLAB for the Spectral and CMTF baselines, and Java for other algorithms. However, it does not provide specific version numbers for any of these programming languages, libraries, or frameworks to ensure reproducibility.
Experiment Setup | No | The paper mentions combining selected features with 'four different classifiers: AdaBoost with classification stumps, linear SVM, RBF-kernel SVM, and quadratic discriminant analysis (QDA)'. It also states results are shown for 'several numbers of selected features {10, 25, 50, 100}'. However, it does not provide specific hyperparameters for these classifiers (e.g., learning rates, C values, kernel parameters, number of boosting rounds) or other detailed training configurations required for reproducibility.
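To make concrete what a reproduction of this setup would have to pin down, here is a scikit-learn sketch of the four classifier families evaluated on a selected feature subset. Every hyperparameter value below (n_estimators, C, gamma, reg_param) is a placeholder assumption; the paper does not report the values it used:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

def evaluate_selected_features(X_train, y_train, X_test, y_test, selected):
    """Fit the four classifier families on a feature subset and return
    test accuracies. Hyperparameters are illustrative defaults only."""
    classifiers = {
        # AdaBoost's default base learner is a depth-1 tree, i.e. a stump.
        "AdaBoost (stumps)": AdaBoostClassifier(n_estimators=100),
        "linear SVM": SVC(kernel="linear", C=1.0),
        "RBF SVM": SVC(kernel="rbf", C=1.0, gamma="scale"),
        "QDA": QuadraticDiscriminantAnalysis(reg_param=0.1),
    }
    Xtr, Xte = X_train[:, selected], X_test[:, selected]
    return {name: clf.fit(Xtr, y_train).score(Xte, y_test)
            for name, clf in classifiers.items()}
```

A full reproduction would run this for each subset size in {10, 25, 50, 100}, which is why the missing hyperparameter values matter: accuracy differences between selection methods can be smaller than the spread induced by classifier settings.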