Decomposed Quadratization: Efficient QUBO Formulation for Learning Bayesian Network
Authors: Yuta Shikuri
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on 16 instances, ranging from 37 to 223 variables, demonstrate that our approach requires orders of magnitude fewer binary variables than quadratization. Moreover, an annealing machine that implements our formulation has outperformed existing algorithms in score maximization. |
| Researcher Affiliation | Industry | Yuta Shikuri Tokio Marine Holdings, Inc. Tokyo, Japan EMAIL |
| Pseudocode | Yes | Algorithm 1: Search Space Reduction. 1: W_open ← F, V_close ← ∅, V_open ← ⋃_{V∈F} 2^V \ {∅}. 2: Merge {V ∈ V_open \| for some W ∈ W_open, V is the only element in V_open such that W ⊆ V ∪ ⋃_{V′∈V_close} V′} into V_close. 3: Exclude ⋃_{V,V′∈V_close} {V ∪ V′} from W_open. 4: Exclude V_close ∪ {V ∈ V_open \| there does not exist W ∈ W_open such that V ⊆ W} from V_open. Algorithm 2: Candidate Parent Subset Reduction. 1: W_open ← F_i, V_close ← ∅, V_open ← ⋃_{V∈F_i} 2^V \ {∅}. 2: while V_close has changed since the last evaluation do 3: Exclude {V ∈ V_open \| V ⊆ W for at most one element W ∈ W_open} from V_open. 4: Execute rows 2 to 4 of Algorithm 1. 5: Merge W_open \ ⋃_{V∈V_close, V′∈V_open} {V ∪ V′} into V_close, and exclude it from W_open. 6: end while 7: Merge W_open \ ⋃_{V∈V_open, V′∈V_close} {V ∪ V′} into V_open. |
| Open Source Code | No | The code was implemented in Julia, version 1.5.3. |
| Open Datasets | Yes | As benchmarks, we adopted 5 simulated datasets from each of the 8 discrete networks: alarm (n = 37), barley (n = 48), hailfinder (n = 56), hepar2 (n = 70), win95pts (n = 76), pathfinder (n = 109), munin1 (n = 186), and andes (n = 223) from the Bayesian network repository. In addition, we used 5 datasets sampled without replacement from each of the 3 datasets: chess (n = 75), mushroom (n = 117), and connect (n = 129) from the Frequent Itemset Mining Dataset Repository. |
| Dataset Splits | No | The sample size of each dataset is 1000. |
| Hardware Specification | Yes | All experiments, except for those using the Digital Annealer, were conducted on a 64-bit Windows machine with an Intel Xeon W-2265 @ 3.50 GHz and 128 GB of RAM. The fourth-generation Fujitsu Digital Annealer was employed for score maximization. |
| Software Dependencies | Yes | Version 9.1.2 of the Gurobi Optimizer was used to solve the integer linear programming problems. The code was implemented in Julia, version 1.5.3. GOB was pygobnilp 1.0, which relies on the Gurobi Optimizer. |
| Experiment Setup | Yes | The time limit for each trial was set to 3600 [s]. The coefficients in the penalty terms were δ1 = 1.1δ0, δ2 = 1.1(n − 2)δ1, and ξ = 1.1ξ0. We used two heuristic solvers and one exact method: the Digital Annealer (DAQ), classical simulated annealing (SAQ), and the Gurobi Optimizer (GOQ). The parameters for DAQ and GOQ were set to their default values. The annealing schedule of SAQ was geometric, starting with T0 = 100. The tabu list length in HCS was set to 10, and the state of 10 randomly selected possible edges was changed upon termination of the greedy search. |
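The paper's headline claim is that its decomposed formulation needs far fewer binary variables than plain quadratization. For context, a standard quadratization step (Rosenberg's penalty; the paper does not specify that this exact construction is its baseline, so treat it as an illustrative assumption) replaces each higher-order product x·y with an auxiliary binary z, enforced by a penalty that vanishes only when z = x·y. A minimal check of that penalty:

```python
from itertools import product

def rosenberg_penalty(x, y, z):
    """Rosenberg's quadratic penalty: xy - 2xz - 2yz + 3z.
    It equals 0 exactly when z == x*y, and is >= 1 otherwise, so adding it
    to a QUBO with a large coefficient enforces the substitution z = x*y."""
    return x * y - 2 * x * z - 2 * y * z + 3 * z

# Verify over all 8 binary assignments.
for x, y, z in product((0, 1), repeat=3):
    p = rosenberg_penalty(x, y, z)
    assert (p == 0) == (z == x * y)
    assert p >= 0
print("penalty is zero exactly when z = x*y")
```

Since one auxiliary variable is spent per product term, the variable count grows quickly with the degree of the score polynomial, which is the overhead the paper's decomposition avoids.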
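The report describes SAQ only as classical simulated annealing with a geometric schedule starting at T0 = 100. A minimal sketch of that setup on a toy 2-variable QUBO follows; the cooling ratio, sweep count, and single-bit-flip neighborhood are assumptions for illustration, not the paper's settings:

```python
import math
import random

def qubo_energy(Q, x):
    """Energy of binary assignment x under QUBO Q, given as {(i, j): coeff}."""
    return sum(c * x[i] * x[j] for (i, j), c in Q.items())

def simulated_annealing(Q, n, t0=100.0, alpha=0.95, sweeps=2000, seed=0):
    """Minimize a QUBO with single-bit flips and a geometric schedule T_k = t0 * alpha^k."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    cur = qubo_energy(Q, x)
    best, best_e = x[:], cur
    t = t0
    for _ in range(sweeps):
        i = rng.randrange(n)
        x[i] ^= 1  # propose flipping one bit
        new = qubo_energy(Q, x)
        if new <= cur or rng.random() < math.exp((cur - new) / t):
            cur = new  # accept (always downhill; uphill with Metropolis probability)
            if cur < best_e:
                best, best_e = x[:], cur
        else:
            x[i] ^= 1  # reject: undo the flip
        t *= alpha  # geometric cooling
    return best, best_e

# Toy QUBO: minimum energy -1 at x = (1, 0) or (0, 1).
Q = {(0, 0): -1.0, (1, 1): -1.0, (0, 1): 2.0}
print(simulated_annealing(Q, 2))
```

In the paper's experiments the same penalty-weighted QUBO would be handed to DAQ, SAQ, and GOQ alike; only the solver differs.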