Tractable Representation Learning with Probabilistic Circuits

Authors: Steven Braun, Sahil Sidheekh, Antonio Vergari, Martin Mundt, Sriraam Natarajan, Kristian Kersting

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical evaluation demonstrates that APCs outperform existing PC-based autoencoding methods in reconstruction quality, generate embeddings competitive with, and exhibit superior robustness in handling missing data compared to, neural autoencoders.
Researcher Affiliation | Academia | Steven Braun (EMAIL), Technische Universität Darmstadt, Germany; Sahil Sidheekh (EMAIL), University of Texas at Dallas, United States; Antonio Vergari (EMAIL), University of Edinburgh, United Kingdom; Martin Mundt (EMAIL), Universität Bremen, Germany; Sriraam Natarajan (EMAIL), University of Texas at Dallas, United States; Kristian Kersting (EMAIL), Technische Universität Darmstadt, Germany
Pseudocode | Yes | Appendix Algorithm 1 outlines the process of data-free knowledge distillation from a VAE to an APC of similar capacity and the same decoder architecture. The procedure iteratively refines the APC to learn from the VAE without the need for original training data. Algorithm 1: APC Encoding Procedure. Algorithm 2: Data-Free Knowledge Distillation from VAE to APC. Algorithm 3: Sampling sum units with SIMPLE.
Open Source Code | Yes | Our implementation is available as open-source software at https://github.com/ml-research/autoencoding-probabilistic-circuits.
Open Datasets | Yes | We evaluate our models on various image and tabular datasets. For image data, we include MNIST (LeCun et al., 1998), Fashion-MNIST (F-MNIST) (Xiao et al., 2017), CIFAR-10 (Krizhevsky, 2009), CelebA (Liu et al., 2015), SVHN (Netzer et al., 2011), Flowers (Gurnani et al., 2017), LSUN (Church) (Yu et al., 2015), and Tiny-ImageNet (Deng et al., 2009). Note that Tiny-ImageNet is occasionally abbreviated as ImageNet in our tables for brevity. For tabular data, we utilize the 20 datasets from the binary density estimation benchmark DEBD (Lowd & Davis, 2010; Haaren & Davis, 2012; Bekker et al., 2015; Larochelle & Murray, 2011).
Dataset Splits | Yes | We evaluate our models on various image and tabular datasets. For image data, we include MNIST (LeCun et al., 1998), Fashion-MNIST (F-MNIST) (Xiao et al., 2017), CIFAR-10 (Krizhevsky, 2009), CelebA (Liu et al., 2015), SVHN (Netzer et al., 2011), Flowers (Gurnani et al., 2017), LSUN (Church) (Yu et al., 2015), and Tiny-ImageNet (Deng et al., 2009). Note that Tiny-ImageNet is occasionally abbreviated as ImageNet in our tables for brevity. For tabular data, we utilize the 20 datasets from the binary density estimation benchmark DEBD (Lowd & Davis, 2010; Haaren & Davis, 2012; Bekker et al., 2015; Larochelle & Murray, 2011).
Hardware Specification | Yes | The experiments were conducted on a single NVIDIA A100 GPU.
Software Dependencies | No | All experiments and models are implemented in PyTorch (Ansel et al., 2024) and PyTorch Lightning (Falcon & The PyTorch Lightning team, 2019). (The paper mentions software frameworks with citations but does not provide explicit version numbers for these frameworks.)
Experiment Setup | Yes | Each model is trained for 10,000 iterations using the AdamW optimizer (Kingma & Ba, 2015; Loshchilov & Hutter, 2017), and convergence was confirmed for all models by the end of this training period. We use the MSE as LREC for all models. Training is carried out with a batch size of 512, except for CelebA, where a batch size of 256 is used due to larger model sizes and VRAM constraints. The initial learning rate is set to 0.1 for APCs and 0.005 for AEs and VAEs. The learning rate is reduced by a factor of 10 at 66% and 90% of the training progress. Additionally, to enhance stability and avoid numerical issues or exploding gradients during the training phase, we utilize an exponential learning rate warmup over the first 2% of training iterations.
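The learning-rate schedule described in the setup row (exponential warmup over the first 2% of iterations, then decay by a factor of 10 at 66% and 90% of training) can be sketched as a small helper. This is an illustrative sketch only: the paper does not specify the exact warmup curve or its starting value, so the `warmup_start_ratio` parameter and the exponential-interpolation form are assumptions.

```python
def lr_at(step, total_steps=10_000, base_lr=0.1,
          warmup_frac=0.02, warmup_start_ratio=1e-4):
    """Sketch of the reported schedule: exponential warmup over the
    first 2% of iterations, then x0.1 decays at 66% and 90% of training.
    warmup_start_ratio and the warmup shape are assumptions, not taken
    from the paper."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        # Exponential interpolation from warmup_start_ratio * base_lr
        # up to base_lr as step approaches warmup_steps.
        return base_lr * warmup_start_ratio ** (1.0 - step / warmup_steps)
    lr = base_lr
    if step >= 0.66 * total_steps:   # first decay at 66% of training
        lr /= 10.0
    if step >= 0.90 * total_steps:   # second decay at 90% of training
        lr /= 10.0
    return lr
```

In practice this multiplier would be attached to the optimizer via a per-step scheduler (e.g. PyTorch's `LambdaLR`); the standalone function above just makes the shape of the schedule explicit.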