A Truncated EM Approach for Spike-and-Slab Sparse Coding

Authors: Abdul-Saboor Sheikh, Jacquelyn A. Shelton, Jörg Lücke

JMLR 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We investigate the performance of the GSC algorithm on artificial data as well as various realistic source separation and denoising benchmarks. For all experiments the algorithm was implemented to run in parallel on multiple CPUs with no dependency on their arrangement as physically collocated arrays with shared memory or distributed among multiple compute nodes (see Bornschein et al., 2010, for more details). We further extended the basic technique to make our implementation more efficient and suitable for parallelization by applying batch execution (the observation discussed in Section 4.1 on Computational Complexity and Appendix C). In all the experiments, the GSC model parameters were randomly initialized.2 The choice of the GSC truncation parameters H′ and γ is in general straightforward: the larger they are, the closer the match to exact EM, but the higher the computational costs. However, empirically we observed that often much smaller values were sufficient than those that are maximally affordable.3 Note that factored variational approaches do not usually offer such a trade-off between the exactness and the computational demand of their inference schemes by means of a simple parameter adjustment.

5.1 Reliability of the Selection Function: To assess the reliability of the selection function we perform a number of experiments on small-scale artificial data generated by the model, such that we can compute both the exact (13) and truncated (23) posteriors. To control for the quality of the truncated posterior approximation, and thus the selection function, we compute the ratio between the posterior mass within the truncated space $\mathcal{K}_n$ and the overall posterior mass (compare Lücke and Eggert, 2010):

$Q^{(n)} = \frac{\sum_{\vec{s} \in \mathcal{K}_n} \int_{\vec{z}} p(\vec{s}, \vec{z} \,|\, \vec{y}^{(n)}, \Theta)\, d\vec{z}}{\sum_{\vec{s}'} \int_{\vec{z}} p(\vec{s}', \vec{z} \,|\, \vec{y}^{(n)}, \Theta)\, d\vec{z}} = \frac{\sum_{\vec{s} \in \mathcal{K}_n} B(\vec{s}; \vec{\pi})\, \mathcal{N}(\vec{y}^{(n)}; W_{\vec{s}} \vec{\mu}_{\vec{s}}, C_{\vec{s}})}{\sum_{\vec{s}'} B(\vec{s}'; \vec{\pi})\, \mathcal{N}(\vec{y}^{(n)}; W_{\vec{s}'} \vec{\mu}_{\vec{s}'}, C_{\vec{s}'})},$ (26)

where the integrals over the latent $\vec{z}$ in (26) are again given in closed form.
The metric $Q^{(n)}$ ranges from zero to one and is directly related to the KL-divergence between the approximation $q_n$ in Equation (23) and the true posterior: $D_{\mathrm{KL}}(q_n(\vec{s}, \vec{z}; \Theta),\, p(\vec{s}, \vec{z} \,|\, \vec{y}^{(n)}, \Theta)) = -\log(Q^{(n)})$.
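The quoted passage defines $Q^{(n)}$ by exhaustive enumeration over the binary states, which is feasible only for small H. As an illustration (not the authors' code), a minimal NumPy/SciPy sketch of this mass ratio is given below; the function names and the marginal covariance $C_{\vec{s}} = W_{\vec{s}} \Sigma_{\vec{s}} W_{\vec{s}}^{\top} + \Psi$ follow standard Gaussian marginalization and are my reconstruction, not taken verbatim from the paper.

```python
import itertools
import numpy as np
from scipy.stats import multivariate_normal

def log_joint(s, y, W, pi, mu, Sigma, Psi):
    """log B(s; pi) + log N(y; W_s mu_s, C_s) for one binary state s."""
    s = np.asarray(s, dtype=bool)
    log_prior = np.sum(np.where(s, np.log(pi), np.log(1.0 - pi)))
    Ws = W[:, s]                                  # columns of the active latents
    mean = Ws @ mu[s]                             # marginal mean of y given s
    C = Ws @ Sigma[np.ix_(s, s)] @ Ws.T + Psi     # marginal covariance C_s
    return log_prior + multivariate_normal.logpdf(y, mean, C)

def Q_ratio(y, W, pi, mu, Sigma, Psi, Kn):
    """Fraction of posterior mass inside the truncated state set Kn (Eq. 26)."""
    H = len(pi)
    all_states = list(itertools.product([0, 1], repeat=H))
    lp_all = np.array([log_joint(s, y, W, pi, mu, Sigma, Psi) for s in all_states])
    lp_Kn = np.array([log_joint(s, y, W, pi, mu, Sigma, Psi) for s in Kn])
    m = lp_all.max()                              # log-sum-exp for stability
    return np.exp(np.log(np.exp(lp_Kn - m).sum()) - np.log(np.exp(lp_all - m).sum()))
```

With $\mathcal{K}_n$ equal to the full state space the ratio is exactly 1, and $-\log Q^{(n)}$ then recovers the KL-divergence relation quoted above.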
Researcher Affiliation | Academia | Abdul-Saboor Sheikh (EMAIL) and Jacquelyn A. Shelton (EMAIL), Faculty of Electrical Engineering and Computer Science, Technical University Berlin, Marchstr. 23, 10587 Berlin, Germany; Jörg Lücke (EMAIL), Cluster of Excellence Hearing4all and Faculty VI, University of Oldenburg, 26115 Oldenburg, Germany
Pseudocode | No | The paper describes algorithms (EM and Truncated EM) using mathematical derivations and prose, but does not present them in a structured pseudocode or algorithm block.
Open Source Code | No | The paper states: 'For later numerical experiments on the model, we used the source code provided along with that publication.1' with Footnote 1: 'We downloaded the code from http://www.well.ox.ac.uk/~mtitsias/code/varSparseCode.tar.gz.' This refers to the MTMKL algorithm's code, which is a method they compare against, not the source code for their own 'GSC' methodology described in this paper.
Open Datasets | Yes | Figure 5 shows the performance of both methods on three different source separation benchmarks obtained from ICALAB (Cichocki et al., 2007). ... Finally, we investigate the performance of the GSC algorithm on the standard house benchmark for denoising, which has been used for the evaluation of similar approaches (e.g., Li and Liu, 2009; Zhou et al., 2009), including the MTMKL spike-and-slab approach.
Dataset Splits | No | The paper mentions using specific numbers of data points (e.g., 'we use N = 200 and N = 500 data points from each benchmark') and generating patches (e.g., 'we generated 62,001 overlapping (shifted by 1 pixel) 8×8 patches'), but it does not specify explicit training, validation, or test splits for the experiments to enable reproduction of data partitioning.
Hardware Specification | No | For all experiments the algorithm was implemented to run in parallel on multiple CPUs with no dependency on their arrangement as physically collocated arrays with shared memory or distributed among multiple compute nodes (see Bornschein et al., 2010, for more details). ... Each trial was run in parallel on 24 computing nodes. The paper mentions 'multiple CPUs' and '24 computing nodes' but does not provide specific details such as CPU models, clock speeds, memory, or GPU specifications.
Software Dependencies | No | The paper mentions using 'ICALAB-MATLAB Toolbox Version 3, 2007' for benchmarks and downloading code for the MTMKL approach from a URL, but it does not specify any software libraries or frameworks with version numbers that were used for the implementation of the GSC algorithm described in the paper.
Experiment Setup | Yes | The GSC model parameters were randomly initialized.2 The choice of the GSC truncation parameters H′ and γ is in general straightforward: the larger they are, the closer the match to exact EM, but the higher the computational costs. ... We then applied 65 iterations of the GSC algorithm for H ∈ {64, 256} for different noise levels σ ∈ {15, 25, 50}. The truncation parameters H′ and γ for each run are listed in Table 2. ... Footnote 2: We randomly and uniformly initialized the πh between 0.05 and 0.95. µ was initialized with normally distributed random values and the diagonal of Ψ was initialized with strictly positive uniformly distributed random values. We set Σ to the covariance across the data points, and the elements of W were independently drawn from a normal distribution with zero mean and unit variance. ... Table 2: ... GSC (H=64): H′=10, γ=8 ... GSC (H=256): H′=18, γ=3
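Footnote 2's initialization recipe is concrete enough to sketch in code. The sketch below is a reconstruction, not the authors' implementation: the paper only says the diagonal of Ψ is 'strictly positive uniformly distributed', so the (0.1, 1.0) interval is an assumption, and setting Σ to the data covariance is only dimensionally valid as the H×H latent covariance when H equals the data dimension D, which this sketch assumes.

```python
import numpy as np

def init_gsc_params(Y, H, seed=None):
    """Random GSC parameter initialization following footnote 2 of the paper.

    Y : (N, D) data matrix; H : number of latent dimensions.
    Note: the paper sets Sigma to 'the covariance across the data points';
    that is directly usable as the (H x H) latent covariance only when
    H == D, which this sketch assumes.
    """
    rng = np.random.default_rng(seed)
    N, D = Y.shape
    pi = rng.uniform(0.05, 0.95, size=H)          # Bernoulli priors, per footnote 2
    mu = rng.normal(size=H)                        # slab means ~ N(0, 1)
    # 'strictly positive uniformly distributed' -- the exact range is an assumption
    Psi = np.diag(rng.uniform(0.1, 1.0, size=D))   # diagonal observation noise
    Sigma = np.cov(Y, rowvar=False)                # latent covariance <- data covariance
    W = rng.normal(size=(D, H))                    # weights ~ N(0, 1)
    return pi, mu, Psi, Sigma, W
```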