Instance Correlation Graph-based Naive Bayes

Authors: Chengyuan Li, Liangxiao Jiang, Wenjun Zhang, Liangjun Yu, Huan Zhang

ICML 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | The experimental results on tens of datasets show that ICGNB significantly outperforms its competitors. Our codes and datasets are available at https://github.com/jiangliangxiao/ICGNB. We design two groups of experiments on 24 real-world datasets and a synthetic dataset, respectively.
Researcher Affiliation | Academia | (1) School of Computer Science, China University of Geosciences, Wuhan 430074, China; (2) College of Computer, Hubei University of Education, Wuhan 430074, China; (3) School of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China. Correspondence to: Liangxiao Jiang <EMAIL>.
Pseudocode | Yes | ICGNB can be partitioned into training (ICGNB-training) and classification (ICGNB-classification) algorithms. They are depicted by Algorithms 2 and 3 provided in Appendix A.
Open Source Code | Yes | Our codes and datasets are available at https://github.com/jiangliangxiao/ICGNB.
Open Datasets | Yes | Our codes and datasets are available at https://github.com/jiangliangxiao/ICGNB. From the real-world datasets published by the KEEL dataset repository (https://sci2s.ugr.es/keel/category.php?cat=clas), we choose all 24 datasets containing only numerical attributes, which represent a wide range of domains and data characteristics. The detailed description of these datasets is provided in Appendix C.
Dataset Splits | Yes | We compare the classification accuracy (%) of ICGNB with its five competitors and ablation variants on these datasets by running 10 separate stratified hold-out validations. In each validation, we use stratified sampling to split the dataset into a training set (80%) and a testing set (20%).
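The quoted validation protocol (10 separate stratified 80/20 hold-out splits) can be sketched as follows. The use of scikit-learn, the synthetic data, and GaussianNB as a stand-in classifier are assumptions for illustration; the paper's own ICGNB implementation is in the linked repository.

```python
# Sketch of 10 stratified 80/20 hold-out validations, per the protocol above.
# scikit-learn and GaussianNB are placeholders, not the paper's ICGNB code.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Synthetic numerical-attribute dataset standing in for a KEEL dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

accuracies = []
for seed in range(10):  # 10 separate hold-out validations
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed  # stratified 80/20 split
    )
    model = GaussianNB().fit(X_tr, y_tr)
    accuracies.append(accuracy_score(y_te, model.predict(X_te)))

mean_acc = float(np.mean(accuracies))
print(f"mean accuracy over 10 runs: {mean_acc:.4f}")
```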
Hardware Specification | No | The paper does not explicitly mention any specific hardware used for running the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | We implement ICGNB, WANBIA, CFWNB, AG-NBC, AE-NBC and GNB in Python. This sentence names the software (Python) but does not provide a specific version number for Python or for any libraries used.
Experiment Setup | Yes | In ICGNB, the number of iterations P is 500, the learning rate η is 0.01, and σ1i = 1000σ2i. In AE-NBC, the number of iterations and the learning rate are the same as those of ICGNB. In AG-NBC, the stopping threshold δ is 0.001 and the regularization factors ε1, ε2 are 1, -1, respectively.
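The quoted hyperparameters (P = 500 iterations, learning rate η = 0.01) describe an iterative gradient-based optimization. The toy quadratic objective below is purely a placeholder to show how those two settings enter a standard update loop; ICGNB's actual loss and parameters are defined in the paper, not here.

```python
# Minimal gradient-descent loop illustrating the quoted hyperparameters.
# The objective ||w||^2 is a stand-in; ICGNB optimizes its own loss.
import numpy as np

P, eta = 500, 0.01          # iterations and learning rate quoted above
w = np.array([5.0, -3.0])   # arbitrary starting point

for _ in range(P):
    grad = 2 * w            # gradient of the toy objective ||w||^2
    w = w - eta * grad      # standard gradient-descent update

print(w)  # after 500 steps, w has shrunk toward the minimizer (0, 0)
```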