Learning Overcomplete, Low Coherence Dictionaries with Linear Inference

Authors: Jesse A. Livezey, Alejandro F. Bujan, Friedrich T. Sommer

JMLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare ICA algorithms to the computationally more expensive sparse coding on synthetic data, and we show that the limited applicability of overcomplete, linear inference can be extended with the proposed cost functions. Finally, when trained on natural images, we show that the coherence control biases the exploration of the data manifold, sometimes yielding suboptimal, coherent solutions. ... We evaluate the coherence and diversity of bases learned on a dataset of natural image patches.
Researcher Affiliation | Academia | Jesse A. Livezey (EMAIL), Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA, and Redwood Center for Theoretical Neuroscience, University of California, Berkeley, Berkeley, California 94720, USA; Alejandro F. Bujan (EMAIL) and Friedrich T. Sommer (EMAIL), Redwood Center for Theoretical Neuroscience, University of California, Berkeley, Berkeley, California 94720, USA
Pseudocode | No | The paper describes methods and mathematical formulations but does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | A repository with code to reproduce the results is available at https://github.com/JesseLivezey/oc_ica, and a repository with code to fit the Gabor kernels is posted at https://github.com/JesseLivezey/gabor_fit.
Open Datasets | Yes | Images were taken from the Van Hateren database (van Hateren and van der Schaaf, 1998). We selected images where there was no evident motion blur and minimal saturated pixels. 8-by-8 patches were taken from these images and whitened using PCA.
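The preprocessing described in this row (PCA whitening of 8-by-8 image patches) can be sketched in NumPy. This is an illustrative reconstruction, not the authors' code: the function name, the small regularizer `eps`, and the synthetic stand-in patches are all assumptions.

```python
import numpy as np

def pca_whiten(patches, eps=1e-8):
    """PCA-whiten flattened patches of shape (n_samples, n_features)."""
    X = patches - patches.mean(axis=0, keepdims=True)  # center each pixel
    cov = X.T @ X / X.shape[0]                          # empirical covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Project onto the eigenbasis and rescale each component to unit variance.
    W = eigvecs / np.sqrt(eigvals + eps)
    return X @ W

# Synthetic stand-in for 8-by-8 patches (64 pixels each, flattened).
rng = np.random.default_rng(0)
patches = rng.normal(size=(1000, 64))
white = pca_whiten(patches)
```

After whitening, the empirical covariance of `white` is numerically the identity, which is the property the ICA and sparse coding models rely on.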
Dataset Splits | No | The paper mentions generating synthetic k-sparse and analysis datasets, and using natural image patches from the Van Hateren database. It describes the number of samples relative to model parameters and the patch size (8x8), but does not provide specific training, validation, or test splits (e.g., percentages or counts).
Hardware Specification | Yes | We acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.
Software Dependencies | No | All models were implemented in Theano (Theano Development Team, 2016). ICA models were trained using the L-BFGS-B (Byrd et al., 1995) implementation in SciPy (Jones et al., 2001-2017). FISTA (Beck and Teboulle, 2009) was used for MAP inference in the sparse coding model, and the weights were learned using L-BFGS-B. While software packages like Theano and SciPy are mentioned with publication years, specific version numbers for these dependencies are not provided.
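FISTA, cited above for MAP inference in the sparse coding model, can be sketched in plain NumPy as a generic lasso solver (minimize 0.5 * ||x - Da||^2 + lam * ||a||_1). This is not the paper's Theano implementation; the dictionary sizes, lambda, and iteration count below are illustrative.

```python
import numpy as np

def fista(D, x, lam, n_iter=200):
    """FISTA for the lasso MAP-inference step:
    minimize 0.5 * ||x - D @ a||^2 + lam * ||a||_1."""
    L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant of the smooth part
    a = np.zeros(D.shape[1])
    y, t = a.copy(), 1.0
    for _ in range(n_iter):
        z = y - D.T @ (D @ y - x) / L       # gradient step on the quadratic term
        a_next = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = a_next + ((t - 1.0) / t_next) * (a_next - a)  # momentum extrapolation
        a, t = a_next, t_next
    return a

# Recover a sparse code from a random overcomplete dictionary (illustrative sizes).
rng = np.random.default_rng(1)
D = rng.normal(size=(32, 64))
D /= np.linalg.norm(D, axis=0)              # unit-norm columns
a_true = np.zeros(64)
a_true[:4] = [1.0, -1.0, 2.0, -2.0]
x = D @ a_true
a_hat = fista(D, x, lam=0.01)
```

The soft-threshold step is the proximal operator of the l1 penalty, and the momentum extrapolation is what distinguishes FISTA from plain ISTA.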
Experiment Setup | Yes | For a 32-dimensional data space, we vary the k-sparseness and overcompleteness of the data. For each of these datasets, where the number of dataset samples was 10 times the mixing matrix dimensionality, we fit all models to the data from 10 random initializations, for a range of sparsity weights λ, if applicable, and then compare the recovery metric across models. ... We train 2-times overcomplete ICA models on 8-by-8 whitened image patches from the Van Hateren database ... at a fixed value of sparsity across costs found by binary search on λ. ... ICA models were trained using the L-BFGS-B ... FISTA ... was used for MAP inference in the sparse coding model ... All weights were trained with the norm-ball projection.
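The "binary search on λ" used to fix sparsity across costs can be sketched as below. Here `fit_fn` is a hypothetical stand-in for training a model at a given λ and returning its measured sparsity; the bracket, tolerance, and toy monotone proxy are assumptions, and sparsity is assumed to increase monotonically with λ.

```python
import numpy as np

def match_sparsity(fit_fn, target, lo=1e-4, hi=10.0, tol=1e-3, max_iter=60):
    """Bisect (geometrically) on the sparsity weight lambda until the
    sparsity reported by fit_fn(lam) is within tol of `target`."""
    lam = np.sqrt(lo * hi)
    for _ in range(max_iter):
        lam = np.sqrt(lo * hi)   # geometric midpoint of the current bracket
        s = fit_fn(lam)
        if abs(s - target) < tol:
            break
        if s < target:           # not sparse enough: raise lambda
            lo = lam
        else:                    # too sparse: lower lambda
            hi = lam
    return lam

# Toy monotone proxy for sparsity-vs-lambda; a real fit_fn would train a model
# and report, e.g., the fraction of near-zero coefficients.
lam = match_sparsity(lambda l: l / (1.0 + l), target=0.5)
```

A geometric (rather than arithmetic) midpoint is a natural choice here because useful λ values typically span several orders of magnitude.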