ASNets: Deep Learning for Generalised Planning

Authors: Sam Toyer, Sylvie Thiébaux, Felipe Trevizan, Lexing Xie

JAIR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We also present a thorough experimental evaluation of ASNets, including a comparison with heuristic search planners on seven probabilistic and deterministic domains, an extended evaluation on over 18,000 Blocksworld instances, and an ablation study."
Researcher Affiliation | Academia | Sam Toyer (EMAIL), Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, 2121 Berkeley Way, Berkeley CA 94720-1660, USA; Sylvie Thiébaux (EMAIL), Felipe Trevizan (EMAIL), and Lexing Xie (EMAIL), College of Engineering and Computer Science, The Australian National University, 145 Science Road, Canberra ACT 2601, Australia.
Pseudocode | Yes | Algorithm 1: Learning ASNet weights θ from a set of training problems P_train.
Open Source Code | Yes | "Code for our experiments is available on GitHub": https://github.com/qxcv/asnets
Open Datasets | Yes | "All instances for this task (as well as Exploding and Deterministic Blocksworld) were generated by the algorithm from Slaney and Thiébaux (2001)."
Dataset Splits | Yes | "We train Triangle Tireworld policies on three problems of sizes 1–3, and test on 17 problems of sizes 4–20. We train Cosa Nostra Pizza policies on five problems with 1–5 toll booths, and test on 17 problems with 6–50 toll booths. We train Probabilistic Blocksworld policies on 25 problems with 5–9 blocks, and test on 30 problems with 15–40 blocks."
Hardware Specification | Yes | "each run was restricted to a single core of an Intel Xeon Platinum 8175 processor attached to an Amazon AWS r5.12xlarge instance, with 16GB of memory available per run."
Software Dependencies | No | The paper mentions the "Ray Tune automated hyperparameter tuning framework (Liaw et al., 2018) and the random forest optimiser from scikit-optimize", as well as the "Adam optimiser", but does not provide version numbers for these software packages or libraries.
Experiment Setup | Yes | "Our networks have two proposition layers and three action layers (i.e. L = 2), with d_h = 16 output channels for each action or proposition module. Training is divided into a series of epochs... More specifically, at the beginning of each epoch, up to T_explore = 70/|P_train| trajectories are sampled... T_train = 700 batches of network optimisation... minibatch... 64 samples... Adam optimiser (β1 = 0.9, β2 = 0.999, ε = 10^-8) with a learning rate of 10^-3. We apply an ℓ2 regulariser of 2 × 10^-4 to prevent weights from exploding, and dropout probability of 0.1 for all layers."
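To make the quoted setup concrete, the sketch below collects the reported hyperparameters into a plain-Python configuration and computes the per-epoch trajectory budget. This is an illustrative sketch only, not the authors' TensorFlow implementation: the dictionary keys and the `trajectories_per_epoch` helper are hypothetical names, and the reading of T_explore as a per-epoch sampling budget is an assumption based on the quoted text.

```python
# Hypothetical configuration mirroring the hyperparameters quoted above.
# Numeric values come from the paper's quoted setup; names are placeholders.
HYPERPARAMS = {
    "proposition_layers": 2,   # L = 2 proposition layers
    "action_layers": 3,        # L + 1 = 3 action layers
    "hidden_channels": 16,     # d_h output channels per module
    "batches_per_epoch": 700,  # T_train optimisation batches per epoch
    "minibatch_size": 64,
    "learning_rate": 1e-3,     # Adam learning rate
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "l2_regulariser": 2e-4,    # keeps weights from exploding
    "dropout": 0.1,            # applied to all layers
}


def trajectories_per_epoch(num_training_problems: int) -> int:
    """T_explore = 70 / |P_train|: the per-epoch trajectory sampling
    budget shrinks as the training set grows (assumed interpretation
    of the quoted schedule)."""
    return 70 // num_training_problems


# e.g. with the 25 Probabilistic Blocksworld training problems:
print(trajectories_per_epoch(25))  # 2 trajectories per epoch
```

With five training problems (the Cosa Nostra Pizza split), the same formula gives 14 trajectories per epoch, so smaller training sets are explored more intensively.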