Sum of Squares Circuits
Authors: Lorenzo Loconte, Stefan Mengel, Antonio Vergari
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we empirically show the effectiveness of sum of squares circuits in performing distribution estimation. ...Finally, we empirically validate the increased expressiveness of sum of squares circuits for distribution estimation, showing they can scale to real-world data when tensorized (Section 7). ...We evaluate structured monotonic (+_sd), squared PCs (±²_R, ±²_C), their sums, and µSOCS as the product of a monotonic and a SOCS PC (+_sd · Σ²_cmp, see Definition 5) on distribution estimation tasks using both continuous and discrete real-world data. |
| Researcher Affiliation | Academia | Lorenzo Loconte1, Stefan Mengel2, Antonio Vergari1 1School of Informatics, University of Edinburgh, UK 2University of Artois, CNRS, Centre de Recherche en Informatique de Lens (CRIL), France |
| Pseudocode | No | The paper describes algorithms like the Multiply algorithm conceptually (e.g., "Multiplying two compatible circuits c1, c2 can be done via the Multiply algorithm in time O(|c1||c2|) as described in Vergari et al. (2021) and which we report in Appendix A.1") but does not provide structured pseudocode or algorithm blocks in the main text. |
| Open Source Code | Yes | Code https://github.com/april-tools/sos-npcs |
| Open Datasets | Yes | We estimate the distribution of four continuous UCI data sets: Power, Gas, Hepmass, MiniBooNE, using the same preprocessing by Papamakarios, Pavlakou, and Murray (2017) (Table C.1). ...We estimate the probability distribution of MNIST, Fashion MNIST and CelebA images (Table C.2) |
| Dataset Splits | Yes | For all UCI data sets, we preprocess them as in Papamakarios, Pavlakou, and Murray (2017), which includes standard z-normalization and random splits for training, validation, and test sets. Specifically, we use an 80/10/10 split respectively. We use the official splits for MNIST, Fashion MNIST, and CelebA. |
| Hardware Specification | No | The paper mentions that models were trained and experiments were performed, but it does not specify any particular hardware components such as GPU models, CPU models, or memory details. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers, such as Python, PyTorch, or CUDA versions. |
| Experiment Setup | Yes | Given a training set $\mathcal{D} = \{\mathbf{x}^{(i)}\}_{i=1}^{N}$ over variables $\mathbf{X}$, we are interested in estimating $p(\mathbf{X})$ from $\mathcal{D}$ by minimizing the negative log-likelihood over a batch $\mathcal{B} \subseteq \mathcal{D}$, i.e., $\mathcal{L} := \lvert\mathcal{B}\rvert \log Z - \sum_{\mathbf{x} \in \mathcal{B}} \log c(\mathbf{x})$, with respect to the parameters via gradient descent. ...For all UCI data sets, we train our models for 500 epochs using the Adam optimizer with a learning rate of 1e-3 and a batch size of 128. |
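
The Pseudocode row notes that the paper describes the Multiply algorithm for compatible circuits only conceptually, citing an $O(|c_1||c_2|)$ construction from Vergari et al. (2021). As a rough illustration of that claim, here is a hedged, minimal Python sketch of multiplying two compatible circuits; the node classes, names, and representation are hypothetical and are not taken from the authors' code at the linked repository:

```python
from dataclasses import dataclass

# Hypothetical minimal circuit representation (an assumption for this sketch,
# not the authors' implementation): leaves hold univariate functions, sum
# nodes are weighted, product nodes are binary over disjoint scopes.
# Compatibility is assumed: both circuits split variable scopes identically.

@dataclass
class Leaf:
    var: int      # index of the variable this leaf depends on
    fn: object    # callable mapping the variable's value to a weight

@dataclass
class Sum:
    weights: list
    children: list

@dataclass
class Prod:
    left: object
    right: object

def evaluate(node, x):
    """Bottom-up feed-forward evaluation of a circuit on assignment x."""
    if isinstance(node, Leaf):
        return node.fn(x[node.var])
    if isinstance(node, Sum):
        return sum(w * evaluate(c, x) for w, c in zip(node.weights, node.children))
    return evaluate(node.left, x) * evaluate(node.right, x)

def multiply(a, b):
    """Product of two compatible circuits, one output node per pair of
    input nodes, hence size O(|a||b|) (cf. Vergari et al. 2021)."""
    if isinstance(a, Leaf) and isinstance(b, Leaf):
        # Compatible leaves share the same variable; multiply their functions.
        assert a.var == b.var
        return Leaf(a.var, lambda v, f=a.fn, g=b.fn: f(v) * g(v))
    if isinstance(a, Sum):
        # Distribute: (sum_i w_i a_i) * b = sum_i w_i (a_i * b)
        return Sum(list(a.weights), [multiply(c, b) for c in a.children])
    if isinstance(b, Sum):
        return Sum(list(b.weights), [multiply(a, c) for c in b.children])
    # Two product nodes over the same scope partition (by compatibility).
    return Prod(multiply(a.left, b.left), multiply(a.right, b.right))
```

A quick sanity check of the construction is that evaluating `multiply(c1, c2)` on any assignment should equal the product of the two circuits' individual outputs.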
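
The quoted training objective in the Experiment Setup row, $\mathcal{L} := \lvert\mathcal{B}\rvert \log Z - \sum_{\mathbf{x} \in \mathcal{B}} \log c(\mathbf{x})$, can be sketched directly. This is a minimal illustration of the batch loss only, assuming the circuit's unnormalized outputs $c(\mathbf{x})$ and log-partition function $\log Z$ are already computed; the function name is hypothetical and this is not the authors' implementation:

```python
import math

def batch_nll(circuit_outputs, log_Z):
    """Negative log-likelihood of a batch under an unnormalized circuit:
    L = |B| * log Z - sum_{x in B} log c(x).

    circuit_outputs: unnormalized circuit values c(x), one per batch example.
    log_Z: log of the circuit's partition function.
    """
    return len(circuit_outputs) * log_Z - sum(math.log(c) for c in circuit_outputs)
```

Per the quote, this loss would then be minimized with Adam (learning rate 1e-3, batch size 128) for 500 epochs on the UCI data sets.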