Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Convolutional Neural Networks Analyzed via Convolutional Sparse Coding

Authors: Vardan Papyan, Yaniv Romano, Michael Elad

JMLR 2017 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental 8. Experiments: The Generator Behind the CNN. Consider the following question: can we synthesize signals obeying the ML-CSC model? Throughout this work we have posited that the answer to this question is positive; we have assumed the existence of a set of signals X which satisfy Γ_i = D_{i+1} Γ_{i+1} for every i, where the representations {Γ_i}, i = 1, ..., K, are all ℓ0,∞-bounded. However, a natural question arises as to whether we can give a simple example of a set of dictionaries {D_i}, i = 1, ..., K, and their corresponding signals X that indeed satisfy our model assumptions. A naïve attempt would be to choose an arbitrary set of dictionaries {D_i} and a random deepest representation Γ_K, and compute the remaining sparse vectors (and the signal itself) using the set of relations Γ_i = D_{i+1} Γ_{i+1}. However, without further restrictions, this would lead to a set of representations {Γ_i} with growing ℓ0,∞ norm as we propagate towards Γ_0. A somewhat better approach would be to impose sparsity on the dictionaries involved, as suggested in Section 7.1, thus leading to sparser representations. However, besides the obvious drawback of forcing a limiting structure on the dictionaries, as can be seen in Equation (11), in the worst case this would also lead to growth in the density of the representations, even if it is more controlled. The spatial stride, at first glance unrelated to this discussion, is another solution that addresses the same problem. In particular, in Section 7.2 this idea was shown to encourage sparser vectors by forcing zeros in a regular pattern in the set of representations {Γ_i}. In this section we combine the above notions in order to achieve our goal: generate a set of signals that will satisfy the ML-CSC assumptions. These will then serve as a playground for several experiments, which will compare, both theoretically and practically, the different pursuits presented in this paper.
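The naïve generation scheme discussed in the quoted passage can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the dictionaries below are dense random placeholders (precisely the case the passage warns about), and the function name `sample_mlcsc` is our own. It shows the propagation Γ_{i-1} = D_i Γ_{i+1}'s relations from a sparse deepest representation Γ_K down to X, and how density grows along the way when the dictionaries are dense.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mlcsc(dictionaries, k_deep):
    """Naively sample an ML-CSC signal: draw a sparse deepest representation
    Gamma_K (nonzeros set to +/-1) and propagate Gamma_{i-1} = D_i Gamma_i
    down to the signal X = D_1 Gamma_1."""
    gamma = np.zeros(dictionaries[-1].shape[1])
    support = rng.choice(gamma.size, size=k_deep, replace=False)
    gamma[support] = rng.choice([-1.0, 1.0], size=k_deep)
    reps = [gamma]  # will hold [X, Gamma_1, ..., Gamma_K]
    for D in reversed(dictionaries):
        reps.insert(0, D @ reps[0])
    return reps

# With dense random dictionaries the number of nonzeros grows as we move
# toward X, illustrating why the paper turns to sparse dictionaries and a
# spatial stride. Dimensions here are arbitrary toy values.
dims = [50, 80, 100]  # signal length, then column counts of D1, D2
Ds = [rng.normal(size=(dims[i], dims[i + 1])) for i in range(len(dims) - 1)]
X, g1, g2 = sample_mlcsc(Ds, k_deep=5)
```

With this setup Γ_2 has 5 nonzeros, while Γ_1 and X are essentially fully dense, which is exactly the uncontrolled growth the paper's sparse-dictionary and stride constructions are designed to avoid.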
Researcher Affiliation Academia Vardan Papyan* (EMAIL), Department of Computer Science, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel; Yaniv Romano* (EMAIL), Department of Electrical Engineering, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel; Michael Elad (EMAIL), Department of Computer Science, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel
Pseudocode Yes Algorithm 1: The layered thresholding algorithm. Input: X, a signal; {D_i}, i = 1, ..., K, convolutional dictionaries; P ∈ {H, S, S+}, a thresholding operator; {β_i}, thresholds. Output: a set of representations {Γ̂_i}, i = 1, ..., K. Process: Γ̂_0 ← X; for i = 1 : K do: Γ̂_i ← P_{β_i}(D_i^T Γ̂_{i−1}); end for. Algorithm 2: The layered iterative soft thresholding algorithm. Input: X, a signal; {D_i}, convolutional dictionaries; P ∈ {S, S+}, a soft thresholding operator; {ξ_i}, Lagrangian parameters; {1/c_i}, step sizes; {T_i}, numbers of iterations. Output: a set of representations {Γ̂_i}. Process: Γ̂_0 ← X; for i = 1 : K do: Γ̂_i^0 ← 0; for t = 1 : T_i do: Γ̂_i^t ← P_{ξ_i/c_i}(Γ̂_i^{t−1} + (1/c_i) D_i^T (Γ̂_{i−1} − D_i Γ̂_i^{t−1})); end for; Γ̂_i ← Γ̂_i^{T_i}; end for.
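The two algorithms can be rendered as a short NumPy sketch. This is our reading of the pseudocode under the soft-thresholding choice P = S, using plain (unrolled, non-convolutional) dictionary matrices; function names are illustrative, not from the paper.

```python
import numpy as np

def soft_threshold(v, beta):
    """Soft-thresholding operator S_beta(v) = sign(v) * max(|v| - beta, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - beta, 0.0)

def layered_thresholding(X, Ds, betas):
    """Algorithm 1: Gamma_hat_i = P_{beta_i}(D_i^T Gamma_hat_{i-1}),
    with Gamma_hat_0 = X."""
    gamma, estimates = X, []
    for D, beta in zip(Ds, betas):
        gamma = soft_threshold(D.T @ gamma, beta)
        estimates.append(gamma)
    return estimates

def layered_ista(X, Ds, xis, cs, Ts):
    """Algorithm 2: at each layer, run T_i ISTA iterations on
    min 0.5*||Gamma_{i-1} - D_i Gamma_i||_2^2 + xi_i*||Gamma_i||_1."""
    prev, estimates = X, []
    for D, xi, c, T in zip(Ds, xis, cs, Ts):
        gamma = np.zeros(D.shape[1])
        for _ in range(T):
            gamma = soft_threshold(gamma + (D.T @ (prev - D @ gamma)) / c, xi / c)
        estimates.append(gamma)
        prev = gamma
    return estimates
```

As a sanity check, with a single identity dictionary both routines reduce to plain soft thresholding of the input.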
Open Source Code No The paper mentions "License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v18/16-505.html." However, this refers to the licensing of the paper itself and does not provide an explicit statement about the availability of source code for the described methodology or a link to a code repository.
Open Datasets No The paper does not use publicly available datasets but rather generates synthetic signals for its experiments. Section 8.1 details the process: "In the first layer we choose this filter to be the analytically defined discrete Meyer Wavelet of length n0 = 29. ... we generate a filter of length 20 with 7 non-zero entries belonging to the set {−8, −7, ..., 7, 8} (these are the entries before the atom is normalized to a unit ℓ2 norm). In practice, this is done by sampling random vectors satisfying these constraints and choosing one resulting in a good mutual coherence." And in Section 8.2: "We now move to the task of sampling a signal when the number of layers is K = 3. First, we draw a random Γ3 of length 100 with an ℓ0 norm in the range [20, 66] and set each nonzero coefficient in it to ±1, with equal probability."
Dataset Splits No The paper generates synthetic signals for its experiments and does not use a traditional dataset with predefined training, validation, or test splits. The experiments involve sampling 'realizations' of signals based on specific parameters, as described in Section 8.2: "For every signal X (termed realization below), we employ the layered hard thresholding algorithm."
Hardware Specification No The paper does not provide specific details regarding the hardware used for running its experiments. It lacks mentions of specific GPU/CPU models, processor types, or memory amounts.
Software Dependencies No The paper does not provide specific details regarding software dependencies with version numbers used for running its experiments. It does not list any programming languages, libraries, or solvers with their respective versions.
Experiment Setup Yes The paper provides extensive details about the experimental setup for generating signals and testing algorithms. For example, in Sections 8.1 and 8.2: "In our experiments, the signal is one dimensional and therefore m0 = 1. ... In the first layer we choose this filter to be the analytically defined discrete Meyer Wavelet of length n0 = 29. In order to obtain sparser representations and improve the coherence of the global dictionary D1, we employ a stride of s0 = 6, resulting in µ(D1) = 2.44 × 10^−4. ... we generate a filter of length 20 with 7 non-zero entries belonging to the set {−8, −7, ..., 7, 8} ... For simplicity, all {D_i}, i = 2, ..., K, are created from the very same local atom, i.e. n_i = 20 for all 1 ≤ i ≤ K − 1. Moreover, in all the dictionaries this atom is shifted by a stride of s_i = 6, leading to µ(D_i) = 4.33 × 10^−3. ... when the number of layers is K = 3. First, we draw a random Γ3 of length 100 with an ℓ0 norm in the range [20, 66] and set each nonzero coefficient in it to ±1, with equal probability." It also specifies noise addition: "Next, we contaminate each signal X with a zero-mean white additive Gaussian noise E, creating a signal Y = X + E. The average SNR of the obtained noisy signals is 68.53 dB." And parameter settings for algorithms: "As such, the parameters for every algorithm are chosen according to our theoretical study. ... For comparison, we run the layered BP with hand-picked ξi and present the obtained results in the same figure."
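The quoted noise-contamination step (Y = X + E with Gaussian E at a 68.53 dB SNR) can be sketched as follows. The paper does not say how the noise level was chosen, so scaling the noise power to hit a target SNR is our assumption, and `add_noise_at_snr` is an illustrative name, not a routine from the paper.

```python
import numpy as np

def add_noise_at_snr(X, snr_db, rng):
    """Return Y = X + E, where E is zero-mean white Gaussian noise whose
    power is chosen so that 10*log10(P_signal / P_noise) equals snr_db."""
    p_signal = np.mean(X ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    E = rng.normal(0.0, np.sqrt(p_noise), size=X.shape)
    return X + E

rng = np.random.default_rng(0)
X = rng.normal(size=100_000)          # stand-in for a batch of clean signals
Y = add_noise_at_snr(X, 68.53, rng)   # target the SNR reported in the paper
snr = 10.0 * np.log10(np.mean(X ** 2) / np.mean((Y - X) ** 2))
```

With enough samples the empirical SNR of Y lands within a small fraction of a dB of the 68.53 dB target.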