The (Un)Scalability of Informed Heuristic Function Estimation in NP-Hard Search Problems
Authors: Sumedh Pendurkar, Taoan Huang, Brendan Juba, Jiapeng Zhang, Sven Koenig, Guni Sharon
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Complementing our theoretical claims, we provide experimental results for three representative NP-hard search problems. The results suggest that fitting deep neural networks to informed heuristic functions requires network sizes that grow quickly with the problem instance size. |
| Researcher Affiliation | Academia | Sumedh Pendurkar, Department of Computer Science and Engineering, Texas A&M University; Taoan Huang, Department of Computer Science, University of Southern California; Brendan Juba, Department of Computer Science and Engineering, Washington University in St. Louis; Jiapeng Zhang, Department of Computer Science, University of Southern California; Sven Koenig, Department of Computer Science, University of Southern California; Guni Sharon, Department of Computer Science and Engineering, Texas A&M University |
| Pseudocode | No | The paper describes algorithms like A* and related concepts textually but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | For our code and dataset see https://github.com/Pi-Star-Lab/unscalable-heuristic-approximator. |
| Open Datasets | Yes | For our code and dataset see https://github.com/Pi-Star-Lab/unscalable-heuristic-approximator. For all domains, the training set (D) included 3,000 random training samples, each of the type (s, h(s)), in cases where the fitting criterion was set to l = 0. When the fitting criterion was relaxed, the training set was set to 8 × 10^5 samples. |
| Dataset Splits | Yes | For all domains, the training set (D) included 3,000 random training samples, each of the type (s, h(s)), in cases where the fitting criterion was set to l = 0. When the fitting criterion was relaxed, the training set was set to 8 × 10^5 samples. The discrepancy in training set sizes is due to scalability issues when fitting to the strict fitting criterion (l = 0). We also sample a separate test set that is used to verify whether such estimators suffer from memorization (Question 1 (a)); thus, minimizing test loss should not be viewed as a goal of the experimental section. For all cases, the test set was set to 2 × 10^5 random samples. |
| Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, memory, or cloud computing instances used for running the experiments. It mentions 'practical computation limitations' but no specific specifications. |
| Software Dependencies | No | The paper mentions several software components and algorithms by name (e.g., Adam optimizer, ReLU, residual connections, batch normalization) and cites their original publications, but it does not specify version numbers for any libraries, frameworks, or programming languages used (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | We optimize the neural networks using the Adam optimizer (Kingma & Ba, 2015). ... To mitigate the impact of local minima, we train each neural network 5 times and report the lowest loss. The entire dataset could fit in memory; as a result, the batch size was set to the size of the dataset. For the first set of experiments, we use l = 0 (Eq. 1) as the fitting criterion and set a maximum of 3,000 epochs to achieve it. For the second set of experiments, we use l_ε^T with c = 0 (i.e., MSE) as the fitting criterion and set a maximum of 300 epochs to achieve it. |
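The training protocol quoted above (full-batch optimization, 5 restarts, report the lowest loss) can be sketched as follows. This is a minimal stand-in, not the authors' implementation: it fits a linear model with plain full-batch gradient descent on MSE in place of the paper's deep networks trained with Adam, and the function names (`train_once`, `best_of_k`) and hyperparameters are illustrative assumptions.

```python
import numpy as np

def train_once(X, y, epochs=300, lr=0.1, rng=None):
    """One training run: full-batch gradient descent on MSE.

    Stand-in for the paper's setup (Adam + deep networks); the batch
    is the whole dataset, matching 'batch size = size of the dataset'.
    Returns the final training MSE.
    """
    rng = rng if rng is not None else np.random.default_rng()
    w = rng.normal(scale=0.1, size=X.shape[1])  # random init per restart
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient over the full batch
        w -= lr * grad
    return float(np.mean((X @ w - y) ** 2))

def best_of_k(X, y, k=5, seed=0):
    """Mitigate local minima as described: train k times, report lowest loss."""
    rng = np.random.default_rng(seed)
    return min(train_once(X, y, rng=rng) for _ in range(k))
```

For a convex linear model the restarts are redundant; they matter only for the non-convex deep networks the paper actually trains, where each restart draws a fresh initialization.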