Variational Learning ISTA
Authors: Fabio Valerio Massoli, Christos Louizos, Arash Behboodi
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our models' performance by comparing them against classical and ML-based baselines on three datasets: MNIST, CIFAR10, and a synthetic dataset. Concerning the synthetic dataset, we follow a similar approach as in Chen et al. (2018); Liu & Chen (2019); Behrens et al. (2021). However, in contrast to the mentioned works, we generate a different Φ matrix for each datum by sampling i.i.d. entries from a standard Gaussian distribution. We generate the ground truth sparse signals by sampling the entries from a standard Gaussian and setting each entry to be non-zero with a probability of 0.1. We generate 5K samples and use 3K for training, 1K for model selection, and 1K for testing. Concerning MNIST and CIFAR10, we train the models using the full images, without applying any crop. For CIFAR10, we gray-scale and normalize the images. We generate the corresponding observations, yi, by multiplying each sensing matrix with the ground truth image: yi = Φisi. We compare the A-DLISTA and VLISTA models against classical and ML baselines. Our classical baselines use the ISTA algorithm, and we pre-compute the dictionary by either considering the canonical or the wavelet basis or using the SPCA algorithm. Our ML baselines use different unfolded learning versions of ISTA, such as LISTA. To demonstrate the benefits of adaptivity, we perform an ablation study on A-DLISTA by removing its augmentation network and making the parameters θt, γt learnable only through backpropagation. We refer to the non-augmented version of A-DLISTA as DLISTA. Therefore, for DLISTA, θt and γt cannot be adapted to the specific input sensing matrix. Moreover, we consider BCS (Ji et al., 2008) as a specific Bayesian baseline for VLISTA. Finally, we conduct Out-Of-Distribution (OOD) detection experiments. We fixed the number of layers to three for all ML models to compare their performance. The classical baselines do not possess learnable parameters. Therefore, we performed an extensive grid search to find the best hyperparameters for them. More details concerning the training procedure and ablation studies can be found in Appendix D and Appendix F. |
| Researcher Affiliation | Industry | Fabio Valerio Massoli (Qualcomm AI Research); Christos Louizos (Qualcomm AI Research); Arash Behboodi (Qualcomm AI Research) |
| Pseudocode | Yes | Algorithm 1: Augmented Dictionary Learning ISTA (A-DLISTA) Inference Algorithm; Algorithm 2: Variational Learning ISTA (VLISTA) Inference Algorithm |
| Open Source Code | No | The paper does not provide any explicit statement about releasing code or a link to a code repository. The OpenReview link is for peer review, not code release. |
| Open Datasets | Yes | We evaluate our models' performance by comparing them against classical and ML-based baselines on three datasets: MNIST, CIFAR10, and a synthetic dataset. Concerning the synthetic dataset, we follow a similar approach as in Chen et al. (2018); Liu & Chen (2019); Behrens et al. (2021). However, in contrast to the mentioned works, we generate a different Φ matrix for each datum by sampling i.i.d. entries from a standard Gaussian distribution. We generate the ground truth sparse signals by sampling the entries from a standard Gaussian and setting each entry to be non-zero with a probability of 0.1. |
| Dataset Splits | Yes | We generate 5K samples and use 3K for training, 1K for model selection, and 1K for testing. ... First, we split the dataset into two distinct subsets: the In-Distribution (ID) set and the OOD set. The ID set comprises images from three randomly chosen digits, while the OOD set includes images of the remaining digits. Then, we partitioned the ID set into training, validation, and test sets for VLISTA. |
| Hardware Specification | Yes | The average inference time was estimated by testing over 1000 batches containing 32 data points using a GeForce RTX 2080 Ti. |
| Software Dependencies | No | The paper mentions using the Adam optimizer, but does not specify any software versions for libraries (e.g., Python, PyTorch, TensorFlow, etc.). |
| Experiment Setup | Yes | Using the Adam optimizer, we train the reconstruction and augmentation models for A-DLISTA jointly. We set the initial learning rate to 1e-2 and 1e-3 for the reconstruction and augmentation network, respectively, and we drop its value by a factor of 10 every time the loss stops improving. Additionally, we set the weight decay to 5e-4 and the batch size to 128. We use Mean Squared Error (MSE) as the objective function for all datasets. We train all the components of VLISTA using the Adam optimizer, similar to A-DLISTA. We set the learning rate to 1e-3 and drop its value by a factor of 10 every time the loss stops improving. Regarding the objective function, we maximize the ELBO and set the weight for KL divergence to 1e-3. ... We fixed the number of layers to three for all ML models to compare their performance. |
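The synthetic dataset described above (a fresh Gaussian Φ per datum, a Gaussian sparse signal with ~10% non-zero entries, observations y = Φs, and a 3K/1K/1K split of 5K samples) can be sketched as follows. The signal and measurement dimensions `n` and `m` are illustrative assumptions, since the quoted text does not state them:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sample(m=20, n=50, p_nonzero=0.1):
    """One (Phi, s, y) triple: per-datum i.i.d. standard-Gaussian Phi,
    standard-Gaussian sparse signal with ~p_nonzero fraction of
    non-zero entries, and observation y = Phi @ s."""
    Phi = rng.standard_normal((m, n))
    s = rng.standard_normal(n) * (rng.random(n) < p_nonzero)
    y = Phi @ s
    return Phi, s, y

# 5K samples, split into 3K training / 1K model selection / 1K testing
samples = [make_sample() for _ in range(5000)]
train, val, test = samples[:3000], samples[3000:4000], samples[4000:]
```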
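The classical baseline referenced in the table is the standard ISTA iteration (not the paper's learned A-DLISTA/VLISTA variants): a gradient step on the least-squares data-fit term followed by soft-thresholding. A minimal sketch, where the regularization weight `lam` and iteration count are illustrative choices rather than values from the paper:

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of the l1 norm: shrink toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(y, Phi, lam=0.1, n_iters=200):
    """Classical ISTA for min_s 0.5 * ||y - Phi s||^2 + lam * ||s||_1.
    Step size 1/L, with L the largest eigenvalue of Phi^T Phi
    (the Lipschitz constant of the data-fit gradient)."""
    L = np.linalg.norm(Phi, 2) ** 2  # squared top singular value
    s = np.zeros(Phi.shape[1])
    for _ in range(n_iters):
        s = soft_threshold(s + Phi.T @ (y - Phi @ s) / L, lam / L)
    return s
```

In the unfolded ML baselines (e.g., LISTA), the fixed step size and threshold of each iteration become per-layer learnable parameters; in A-DLISTA, the quoted θt, γt are additionally predicted from the input sensing matrix by the augmentation network.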