reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On CNF Conversion for SAT and SMT Enumeration

Authors: Gabriele Masina, Giuseppe Spallitta, Roberto Sebastiani

JAIR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this paper, we investigate both theoretically and empirically the effectiveness of CNF conversions for SAT and SMT enumeration. On the negative side, we show that: (i) Tseitin transformation prevents the solver from producing short partial assignments, thus seriously affecting the effectiveness of enumeration; (ii) Plaisted and Greenbaum transformation overcomes this problem only in part. On the positive side, we prove theoretically and we show empirically that combining Plaisted and Greenbaum transformation with NNF preprocessing upfront which is typically not used in solving can fully overcome the problem and can drastically reduce both the number of partial assignments and the execution time.
Researcher Affiliation	Academia	GABRIELE MASINA , University of Trento, DISI, Italy GIUSEPPE SPALLITTA, Rice University, USA ROBERTO SEBASTIANI, University of Trento, DISI, Italy Authors Contact Information: Gabriele Masina, orcid: 0000-0001-8842-4913, EMAIL, University of Trento, DISI, Trento, Italy; Giuseppe Spallitta, orcid: 0000-0002-4321-4995, EMAIL, Rice University, Houston, Texas, USA; Roberto Sebastiani, orcid: 0000-0002-0989-6101, EMAIL, University of Trento, DISI, Trento, Italy.
Pseudocode	Yes	Algorithm 1 Minimize-Assignment(𝜓𝑖,𝜂𝑖, A) Input: CNF formula 𝜓𝑖, total truth assignment 𝜂𝑖s.t. 𝜂𝑖\|=𝑝𝜓𝑖, set of relevant atoms A: 𝜓𝑖(A B) def= 𝜓 Ó𝑖 1 𝑗=1 𝜇A 𝑗if disjoint, 𝜓𝑖 def= 𝜓if non-disjoint 𝜂𝑖= 𝜂A 𝑖 𝜂B 𝑖 Output: minimal partial truth assignment 𝜇A 𝑖s.t. 𝜇A 𝑖 𝜂B 𝑖\|=𝑝𝜓𝑖 1: 𝜇A 𝑖 𝜂A 𝑖 2: for ℓ 𝜇A 𝑖do 3: if 𝜓𝑖\|𝜇A 𝑖\{ℓ} 𝜂B 𝑖= then 4: 𝜇A 𝑖 𝜇A 𝑖\ {ℓ} 5: return 𝜇A 𝑖
Open Source Code	Yes	Benchmarks, results, and source code are made available online on a Zenodo repository [50, 51]. An updated version of the source code is available at https://github.com/masinag/allsat-cnf.
Open Datasets	Yes	The ISCAS 85 benchmarks [12, 27] as done in [74]. The third one is a set of benchmarks on combinatorial circuits encoded as And-Inverter Graphs (AIGs) used in [21]. ... The WMI benchmarks. WMI problems were generated using the procedure described in [70].
Dataset Splits	No	The paper describes how the benchmark instances were generated or configured (e.g., "constrained 60%, 70%, 80%, 90%, and 100% of the outputs to either 0 or 1", "randomly generated"), but does not specify explicit training, validation, or test dataset splits in terms of percentages, sample counts, or predefined partition files for reproducing the experimental evaluation. The experiments are focused on enumeration, which typically does not involve train/test splits for model training.
Hardware Specification	Yes	All the experiments were run on an Intel Xeon Gold 6238R @ 2.20GHz 28 Core machine with 128 GB of RAM, running Ubuntu Linux 20.04.
Software Dependencies	No	The paper mentions several software tools like "Math SAT", "Tabular All SAT", "Tabular All SMT", and "Py SMT". While "Math SAT5" is mentioned, implying version 5, specific version numbers for "Py SMT", "Tabular All SAT", and "Tabular All SMT" are not explicitly provided in the text. The operating system "Ubuntu Linux 20.04" is specified, but the core software dependencies lack precise versioning.
Experiment Setup	Yes	We ran Math SAT with the option -dpll.allsat_minimize_model=true to enumerate minimal partial assignments. For disjoint and non-disjoint enumeration, we set the options -dpll.allsat_allow_duplicates=false and -dpll.allsat_allow_duplicates=true, respectively. We also set the options -preprocessor.simplification=0 and preprocessor.toplevel_propagation=false to disable several non-validity-preserving preprocessing steps. As discussed in 3.2 and 4, we also set the options -dpll.branching_initial_phase=0 and -dpll.branching_cache_phase=2 to split on the false branch first but enabling phase caching. Tabular All SAT was run with default options, which include branching on the false branch first. Tabular All SMT was run with the same options as Math SAT to disable preprocessing steps, since it uses Math SAT as a backend for theory reasoning. All the experiments were run on an Intel Xeon Gold 6238R @ 2.20GHz 28 Core machine with 128 GB of RAM, running Ubuntu Linux 20.04. For each problem set, we set a timeout of 3600s.