Unique Sharp Local Minimum in L1-minimization Complete Dictionary Learning

Authors: Yu Wang, Siqi Wu, Bin Yu

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Simulation studies show that DL-BCD has competitive performance in terms of recovery rate compared to other state-of-the-art dictionary learning algorithms when the reference dictionary is generated from random Gaussian matrices." (Abstract) ... "In this section, we evaluate the proposed algorithms with numerical simulations. We will study the empirical running time of Algorithm 1 in the first experiment and examine how the perturbation parameter ρ affects its performance in the second. In the third experiment, we study the sample size requirement for successful recovery of the reference dictionary. Finally, we will compare Algorithm 2 against other state-of-the-art dictionary learning algorithms (Parker et al., 2014a,b; Parker and Schniter, 2016)."
Researcher Affiliation | Collaboration | Yu Wang (EMAIL), Department of Statistics, University of California, Berkeley, Berkeley, CA 94720-1776, USA; Siqi Wu (EMAIL), Citadel Securities, 131 South Dearborn, Chicago, IL 60603, USA; Bin Yu (EMAIL), Department of Statistics and EECS, University of California, Berkeley, Berkeley, CA 94720-1776, USA.
Pseudocode | Yes | "Algorithm 1: Sharp local minimum test for ℓ1-minimization dictionary learning" ... "Algorithm 2: Dictionary Learning Block Coordinate Descent (DL-BCD)"
Open Source Code | Yes | "The source code of the DL-BCD algorithm can be found in the GitHub repository" (https://github.com/shifwang/dl-bcd).
Open Datasets | No | "We generate n samples using a noisy linear model: x^(i) = D* α^(i) + ε^(i), i = 1, ..., n. The reference dictionary D*, the reference coefficients α^(i), and the noise ε^(i) are generated as follows. Generation of D*: First, we generate a random Gaussian matrix X ∈ R^(K×K) where X_jk ~ N(0, 1). We then scale the columns of X to get the columns of the reference dictionary: D*_j = X_j / ‖X_j‖_2 for j = 1, ..., K. Generation of α^(i): We generate the reference coefficients from a sparse Gaussian distribution with sparsity s: α^(i) ~ SG(s) for i = 1, ..., n. Generation of ε^(i): We generate ε^(i) using a Gaussian distribution with mean zero. The variance of the distribution is set such that the signal-to-noise ratio is 100: E‖D* α^(1)‖₂² / E‖ε^(1)‖₂² = 10²."
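The quoted generation procedure can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' code: the function name is ours, SG(s) is read as "s coordinates chosen uniformly at random, each filled with an independent N(0, 1) value", and the SNR is read as the ratio of expected squared signal norm to expected squared noise norm; all three readings are assumptions.

```python
import numpy as np

def generate_synthetic_data(n=1600, K=20, s=10, snr=100.0, seed=None):
    """Sketch of the described synthetic model (details are assumptions)."""
    rng = np.random.default_rng(seed)
    # Reference dictionary: Gaussian matrix with columns scaled to unit norm.
    X = rng.standard_normal((K, K))
    D = X / np.linalg.norm(X, axis=0)
    # Sparse Gaussian coefficients alpha^(i) ~ SG(s): s random coordinates,
    # each an independent N(0, 1) draw (assumed reading of SG(s)).
    alpha = np.zeros((K, n))
    for i in range(n):
        support = rng.choice(K, size=s, replace=False)
        alpha[support, i] = rng.standard_normal(s)
    signal = D @ alpha
    # Noise variance chosen so that E||signal||^2 / E||eps||^2 = snr
    # (one plausible reading of "signal-to-noise ratio is 100").
    noise_var = np.mean(np.sum(signal**2, axis=0)) / (snr * K)
    eps = np.sqrt(noise_var) * rng.standard_normal((K, n))
    return D, alpha, signal + eps
```

Each returned column of the data matrix is one sample x^(i); the unit-norm check on D mirrors the column scaling step in the quote.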
Dataset Splits | No | The paper uses synthetic data generated according to a specified model rather than pre-existing datasets with defined splits. No training/validation/test splits are mentioned for any dataset.
Hardware Specification | Yes | "The first two less computationally intensive simulations are run on an openSUSE OS with an Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz with 12GB memory, while the last two simulations are conducted in a cluster with 20 cores."
Software Dependencies | No | "The implementation of these algorithms is available in the MATLAB package BiG-AMP (Parker et al., 2014a,b)." ... "We use the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm (Witzgall and Fletcher, 1989)..." The paper mentions MATLAB and the BFGS algorithm but does not specify version numbers for any software libraries or tools used in its own implementation; the operating system (openSUSE) is not a software dependency in this context.
Experiment Setup | Yes | "We set dictionary dimension K = 20, sparsity parameter s = 10 and sample size n = 1600. Perturbation level is set at ρ = 0.01 and the threshold T = 10^-6." ... "The outer loop that performs EM iterations is allowed up to 20 iterations. The inner loop is allowed a minimum of 30 and a maximum of 1500 iterations." ... "The number of iterations is set to be 1000. The enforced sparsity is set to be the same as the true sparsity s of the underlying model." ... "The number of iterations is set to be 1000 and the penalty parameter in front of the ℓ1 norm is λ = .1/..." ... "Our algorithm has an outer loop and an inner loop. The outer loop is set to be at most 3. The inner loop is allowed a maximum of 100 iterations. τ is either ... or 0.5."
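For quick reference, the DL-BCD settings reported in the quote can be collected into a single configuration sketch. The key names below are ours, not the authors' variable names; the truncated λ value and the missing τ option are left out rather than guessed.

```python
# Hypothetical consolidation of the reported DL-BCD experiment settings.
dl_bcd_settings = {
    "K": 20,                  # dictionary dimension
    "s": 10,                  # sparsity parameter (matches true model sparsity)
    "n": 1600,                # sample size
    "rho": 0.01,              # perturbation level
    "threshold_T": 1e-6,      # sharp-local-minimum test threshold
    "outer_iters_max": 3,     # DL-BCD outer loop cap
    "inner_iters_max": 100,   # DL-BCD inner loop cap
}
```

Comparing these against the competing algorithms' settings (20 EM iterations, 30-1500 inner iterations, 1000 iterations elsewhere) makes the per-method budgets in the quote easier to scan.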