A Truncated EM Approach for Spike-and-Slab Sparse Coding

Authors: Abdul-Saboor Sheikh, Jacquelyn A. Shelton, Jörg Lücke

JMLR 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We investigate the performance of the GSC algorithm on artificial data as well as various realistic source separation and denoising benchmarks. For all experiments the algorithm was implemented to run in parallel on multiple CPUs with no dependency on their arrangement as physically collocated arrays with shared memory or distributed among multiple compute nodes (see Bornschein et al., 2010, for more details). We further extended the basic technique to make our implementation more efficient and suitable for parallelization by applying batch execution (the observation discussed in Section 4.1 on Computational Complexity and Appendix C). In all the experiments, the GSC model parameters were randomly initialized.2 The choice of the GSC truncation parameters H′ and γ is in general straightforward: the larger they are, the closer the match to exact EM, but the higher the computational costs. However, empirically we observed that often much smaller values were sufficient than those that are maximally affordable.3 Note that factored variational approaches do not usually offer such a trade-off between the exactness and the computational demand of their inference schemes by means of a simple parameter adjustment.

5.1 Reliability of the Selection Function: To assess the reliability of the selection function we perform a number of experiments on small-scale artificial data generated by the model, such that we can compute both the exact (13) and truncated (23) posteriors. To control for the quality of the truncated posterior approximation, and thus the selection function, we compute the ratio between the posterior mass within the truncated space $\mathcal{K}_n$ and the overall posterior mass (compare Lücke and Eggert, 2010):

$Q^{(n)} = \frac{\sum_{\vec{s} \in \mathcal{K}_n} \int_{\vec{z}} p(\vec{s}, \vec{z} \,|\, \vec{y}^{(n)}, \Theta)\, d\vec{z}}{\sum_{\vec{s}'} \int_{\vec{z}} p(\vec{s}', \vec{z} \,|\, \vec{y}^{(n)}, \Theta)\, d\vec{z}} = \frac{\sum_{\vec{s} \in \mathcal{K}_n} B(\vec{s}; \vec{\pi})\, \mathcal{N}(\vec{y}^{(n)}; W_{\vec{s}} \vec{\mu}_{\vec{s}}, C_{\vec{s}})}{\sum_{\vec{s}'} B(\vec{s}'; \vec{\pi})\, \mathcal{N}(\vec{y}^{(n)}; W_{\vec{s}'} \vec{\mu}_{\vec{s}'}, C_{\vec{s}'})},$ (26)

where the integrals over the latent $\vec{z}$ in (26) are again given in closed form.
The metric $Q^{(n)}$ ranges from zero to one and is directly related to the KL-divergence between the approximation $q_n$ in Equation (23) and the true posterior: $D_{\mathrm{KL}}(q_n(\vec{s}, \vec{z}; \Theta),\, p(\vec{s}, \vec{z} \,|\, \vec{y}^{(n)}, \Theta)) = -\log(Q^{(n)})$.
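The quoted passage defines $Q^{(n)}$ by exhaustive enumeration over the binary states, which is feasible only for small H. As an illustration (not the authors' code), a minimal NumPy/SciPy sketch of this mass ratio is given below; the function names and the marginal covariance $C_{\vec{s}} = W_{\vec{s}} \Sigma_{\vec{s}} W_{\vec{s}}^{\top} + \Psi$ follow standard Gaussian marginalization and are my reconstruction, not taken verbatim from the paper.

```python
import itertools
import numpy as np
from scipy.stats import multivariate_normal

def log_joint(s, y, W, pi, mu, Sigma, Psi):
    """log B(s; pi) + log N(y; W_s mu_s, C_s) for one binary state s."""
    s = np.asarray(s, dtype=bool)
    log_prior = np.sum(np.where(s, np.log(pi), np.log(1.0 - pi)))
    Ws = W[:, s]                                  # columns of the active latents
    mean = Ws @ mu[s]                             # marginal mean of y given s
    C = Ws @ Sigma[np.ix_(s, s)] @ Ws.T + Psi     # marginal covariance C_s
    return log_prior + multivariate_normal.logpdf(y, mean, C)

def Q_ratio(y, W, pi, mu, Sigma, Psi, Kn):
    """Fraction of posterior mass inside the truncated state set Kn (Eq. 26)."""
    H = len(pi)
    all_states = list(itertools.product([0, 1], repeat=H))
    lp_all = np.array([log_joint(s, y, W, pi, mu, Sigma, Psi) for s in all_states])
    lp_Kn = np.array([log_joint(s, y, W, pi, mu, Sigma, Psi) for s in Kn])
    m = lp_all.max()                              # log-sum-exp for stability
    return np.exp(np.log(np.exp(lp_Kn - m).sum()) - np.log(np.exp(lp_all - m).sum()))
```

With $\mathcal{K}_n$ equal to the full state space the ratio is exactly 1, and $-\log Q^{(n)}$ then recovers the KL-divergence relation quoted above.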
Researcher Affiliation | Academia | Abdul-Saboor Sheikh (EMAIL) and Jacquelyn A. Shelton (EMAIL), Faculty of Electrical Engineering and Computer Science, Technical University Berlin, Marchstr. 23, 10587 Berlin, Germany; Jörg Lücke (EMAIL), Cluster of Excellence Hearing4all and Faculty VI, University of Oldenburg, 26115 Oldenburg, Germany
Pseudocode | No | The paper describes algorithms (EM and Truncated EM) using mathematical derivations and prose, but does not present them in a structured pseudocode or algorithm block.
Open Source Code | No | The paper states: 'For later numerical experiments on the model, we used the source code provided along with that publication.1' with Footnote 1: 'We downloaded the code from http://www.well.ox.ac.uk/~mtitsias/code/varSparseCode.tar.gz.' This refers to the MTMKL algorithm's code, which is a method they compare against, not the source code for their own 'GSC' methodology described in this paper.
Open Datasets | Yes | Figure 5 shows the performance of both methods on three different source separation benchmarks obtained from ICALAB (Cichocki et al., 2007). ... Finally, we investigate the performance of the GSC algorithm on the standard house benchmark for denoising, which has been used for the evaluation of similar approaches (e.g., Li and Liu, 2009; Zhou et al., 2009), including the MTMKL spike-and-slab approach.
Dataset Splits | No | The paper mentions using specific numbers of data points (e.g., 'we use N = 200 and N = 500 data points from each benchmark') and generating patches (e.g., 'we generated 62,001 overlapping (shifted by 1 pixel) 8×8 patches'), but it does not specify explicit training, validation, or test splits for the experiments to enable reproduction of data partitioning.
Hardware Specification | No | For all experiments the algorithm was implemented to run in parallel on multiple CPUs with no dependency on their arrangement as physically collocated arrays with shared memory or distributed among multiple compute nodes (see Bornschein et al., 2010, for more details). ... Each trial was run in parallel on 24 computing nodes. The paper mentions 'multiple CPUs' and '24 computing nodes' but does not provide specific details such as CPU models, clock speeds, memory, or GPU specifications.
Software Dependencies | No | The paper mentions using 'ICALAB-MATLAB Toolbox Version 3, 2007' for benchmarks and downloading code for the MTMKL approach from a URL, but it does not specify any software libraries or frameworks with version numbers that were used for the implementation of the GSC algorithm described in the paper.
Experiment Setup | Yes | The GSC model parameters were randomly initialized.2 The choice of the GSC truncation parameters H′ and γ is in general straightforward: the larger they are, the closer the match to exact EM, but the higher the computational costs. ... We then applied 65 iterations of the GSC algorithm for H ∈ {64, 256} for different noise levels σ ∈ {15, 25, 50}. The truncation parameters H′ and γ for each run are listed in Table 2. ... Footnote 2: We randomly and uniformly initialized the πh between 0.05 and 0.95. µ was initialized with normally distributed random values and the diagonal of Ψ was initialized with strictly positive uniformly distributed random values. We set Σ to the covariance across the data points, and the elements of W were independently drawn from a normal distribution with zero mean and unit variance. ... Table 2: ... GSC (H=64): H′=10, γ=8 ... GSC (H=256): H′=18, γ=3
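Footnote 2's initialization recipe is concrete enough to sketch in code. The sketch below is a reconstruction, not the authors' implementation: the paper only says the diagonal of Ψ is 'strictly positive uniformly distributed', so the (0.1, 1.0) interval is an assumption, and setting Σ to the data covariance is only dimensionally valid as the H×H latent covariance when H equals the data dimension D, which this sketch assumes.

```python
import numpy as np

def init_gsc_params(Y, H, seed=None):
    """Random GSC parameter initialization following footnote 2 of the paper.

    Y : (N, D) data matrix; H : number of latent dimensions.
    Note: the paper sets Sigma to 'the covariance across the data points';
    that is directly usable as the (H x H) latent covariance only when
    H == D, which this sketch assumes.
    """
    rng = np.random.default_rng(seed)
    N, D = Y.shape
    pi = rng.uniform(0.05, 0.95, size=H)          # Bernoulli priors, per footnote 2
    mu = rng.normal(size=H)                        # slab means ~ N(0, 1)
    # 'strictly positive uniformly distributed' -- the exact range is an assumption
    Psi = np.diag(rng.uniform(0.1, 1.0, size=D))   # diagonal observation noise
    Sigma = np.cov(Y, rowvar=False)                # latent covariance <- data covariance
    W = rng.normal(size=(D, H))                    # weights ~ N(0, 1)
    return pi, mu, Psi, Sigma, W
```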