Dimension-Independent Rates for Structured Neural Density Estimation

Authors: Robert A. Vandermeulen, Wai Ming Tai, Bryon Aragam

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type: Experimental — "We further provide empirical evidence that, in real-world applications, t is often a small constant, thus effectively circumventing the curse of dimensionality. Moreover, for sequential data (e.g., audio or text) exhibiting a similar local dependence structure, our analysis shows a rate of n^(-1/(t+5)), offering further evidence of dimension independence in practical scenarios. [...] 3.2. Experimental Validation: While MRFs are extremely well-established in image processing (Li, 2009; Blake et al., 2011), it is nonetheless instructive and informative to experimentally validate these assumptions using natural images. We provide an example using CIFAR-10 here, while similar experiments using the COCO and Google Speech Commands datasets are presented in Appendix E. The top row of Figure 5 shows the grayscale values of pixel (8,8) versus selected other pixels for 100 randomly chosen images from the CIFAR-10 training dataset."
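The dependence check quoted above (grayscale value of pixel (8,8) against selected other pixels over 100 random CIFAR-10 images) can be sketched as follows. This is a minimal, self-contained illustration, not the paper's code: random data stands in for the CIFAR-10 grayscale images, and the comparison pixels are hypothetical choices.

```python
import numpy as np

# Stand-in for 100 randomly chosen CIFAR-10 images converted to 32x32
# grayscale; random data is used here so the sketch is self-contained.
rng = np.random.default_rng(0)
images = rng.random((100, 32, 32))

# Grayscale value of the anchor pixel (8, 8) in every image.
anchor = images[:, 8, 8]

# Empirical correlation with selected other pixels.  On natural images,
# neighbors of (8, 8) correlate strongly while distant pixels do not,
# which is the local (Markov) dependence structure the paper validates.
for r, c in [(8, 9), (9, 8), (16, 16), (24, 24)]:
    corr = np.corrcoef(anchor, images[:, r, c])[0, 1]
    print(f"corr((8,8), ({r},{c})) = {corr:+.3f}")
```

On real image data, the scatter of pixel (8,8) against its immediate neighbors would show a strong linear trend, while distant pixels would look nearly independent.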
Researcher Affiliation: Collaboration — "Robert A. Vandermeulen <EMAIL> [...]" The paper does not explicitly list institutional affiliations for all authors. Robert A. Vandermeulen's contact address is a personal Gmail account, which could indicate an independent researcher or simply a personal correspondence address used while affiliated with an academic or industry institution. Because a Gmail address is neither purely academic nor purely industry, and the other authors list no affiliations, the assessment defaults to a mixed collaboration.
Pseudocode: No — The paper describes theoretical results and mathematical formulations, but it does not contain any structured pseudocode or algorithm blocks. The methods are described narratively and mathematically.
Open Source Code: No — The paper does not contain any explicit statements about releasing source code, nor does it provide links to any code repositories.
Open Datasets: Yes — "We provide an example using CIFAR-10 here, while similar experiments using the COCO and Google Speech Commands datasets are presented in Appendix E. [...] from the COCO 2014 dataset (Lin et al., 2014). [...] We use the Google Speech Commands Dataset (Warden, 2018)"
Dataset Splits: No — The paper mentions using "100 randomly chosen images from the CIFAR-10 training dataset" and "4000 random samples" from COCO, with "100 images with pixel (121,160) nearest to the median" chosen for the conditional plots; for Google Speech Commands, it used "500 randomly selected samples." These describe sample selection for visualization and analysis, not formal training/validation/test splits for model evaluation.
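The conditional-plot selection described above (keeping the 100 images whose pixel (121,160) value is nearest the sample median) can be sketched as below. This is a hypothetical illustration under assumed array shapes, with random data standing in for the COCO samples, not the paper's actual pipeline.

```python
import numpy as np

# Hypothetical stand-in for 4000 COCO samples as grayscale arrays;
# the paper's actual loading and preprocessing are not specified here.
rng = np.random.default_rng(1)
images = rng.random((4000, 240, 320))

# Value of the anchor pixel (121, 160) in every image.
vals = images[:, 121, 160]

# Keep the 100 images whose anchor-pixel value is nearest the median,
# so the conditional plots condition on an (approximately) fixed value.
dist = np.abs(vals - np.median(vals))
nearest = np.argsort(dist)[:100]
subset = images[nearest]
print(subset.shape)  # (100, 240, 320)
```

Conditioning on a near-constant anchor value in this way lets the remaining pixel scatter reflect conditional, rather than marginal, dependence.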
Hardware Specification: No — The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies: No — The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or specific libraries with their versions).
Experiment Setup: No — The paper focuses on theoretical convergence rates and mathematical proofs for neural density estimation. The "Experimental Validation" section describes data analysis and visualization rather than model-training experiments with specific hyperparameters (e.g., learning rate, batch size, number of epochs) or training configurations.