Differentiable Logic Machines
Authors: Matthieu Zimmer, Xuening Feng, Claire Glanois, Zhaohui Jiang, Jianyi Zhang, Paul Weng, Dong Li, Jianye Hao, Wulong Liu
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed four series of experiments. The first (resp. second) series evaluates DLM against SOTA baselines on ILP (resp. RL) tasks. The third series corresponds to an ablation study that justifies the different components (i.e., critic, Gumbel-softmax, dropout) of our method. The last series presents a comparison of the methods in terms of computational costs. |
| Researcher Affiliation | Collaboration | Matthieu Zimmer EMAIL Huawei Noah's Ark Lab, London, United Kingdom. Xuening Feng EMAIL UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China. Claire Glanois EMAIL IT University of Copenhagen, Copenhagen, Denmark. Zhaohui Jiang EMAIL UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China. Jianyi Zhang EMAIL UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China. Paul Weng EMAIL UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China. Dong Li EMAIL Huawei Noah's Ark Lab, China. Jianye Hao EMAIL Huawei Noah's Ark Lab, China; School of Computing and Intelligence, Tianjin University, China. Wulong Liu EMAIL Huawei Noah's Ark Lab, Canada. |
| Pseudocode | Yes | Algorithm 2 Supervised training of DLM, Algorithm 3 RL training of DLM, Algorithm 4 Incremental training of DLM |
| Open Source Code | No | The text discusses source code for other methods (NLM, d NL-ILP, and a GitHub link for d NL-ILP) but does not provide concrete access to source code for the methodology described in this paper (DLM). |
| Open Datasets | No | The paper describes tasks like 'Blocks World', 'Sorting', 'Path', 'Family Tree' and 'Graph Reasoning'. For Family Tree and Graph Reasoning, it states that instances are 'randomly generated from the same data generator in NLM and DLM'. For RL tasks, it refers to 'three Blocks World tasks from Jiang & Luo (2019)' and 'three other tasks ... from Dong et al. (2019)', implying the data setup might originate from these works, but does not provide concrete access information (link, DOI, specific citation for datasets) for its own experimental data. |
| Dataset Splits | No | The paper mentions testing performance over '100 instances' or '250 random instances' and describes evaluation on different numbers of constants ('m = 10' and 'M = 50'). It also states that ILP instances are 'randomly generated'. However, it does not provide specific training/validation/test dataset splits (e.g., percentages, sample counts, or predefined split references) for reproduction. |
| Hardware Specification | Yes | The experiments are run with one CPU thread and one GPU unit on a computer with the specifications shown in Table 6: CPU: 2 × Intel(R) Xeon(R) CPU E5-2678 v3; Threads: 48; Memory: 64 GB (4 × 16 GB, 2666 MHz); GPU: 4 × GeForce GTX 1080 Ti. |
| Software Dependencies | No | The paper mentions using ADAM as an optimizer and PPO as an algorithm, but it does not specify any programming language versions (e.g., Python, C++), or library/framework versions (e.g., PyTorch, TensorFlow, scikit-learn) with specific version numbers required to replicate the experiments. |
| Experiment Setup | Yes | We used ADAM with a learning rate of 0.005, 5 trajectories, a clip of 0.2 in the PPO loss, λ = 0.9 in GAE, and a value-function clipping of 0.2. For the softmax over the action distribution, we used a temperature of 0.01. Table 10 (hyperparameters of the noise in DLM; each entry gives starting value / exponential decay / approximate final value): SL — temperature τ of Gumbel dist. 1 / 0.995 / 0.5; scale β of Gumbel dist. 1 / 0.98 / 0.005; dropout probability 0.1 / 0.98 / 0.0005. RL — temperature τ of Gumbel dist. 1 / 0.995 / task-dependent; scale β of Gumbel dist. 0.1 / 0.98 / task-dependent; dropout probability 0.01 / 0.98 / task-dependent. |
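The exponential-decay noise schedule described in Table 10 can be sketched as follows. This is a minimal illustration under assumptions: the helper name `decayed`, the per-epoch application of the decay, and the use of a hard floor to realize the "approximate final value" are not specified in the paper.

```python
def decayed(start: float, decay: float, step: int, floor: float = 0.0) -> float:
    """Hyperparameter value after `step` multiplicative decay updates,
    clipped from below at `floor` (illustrative, not from the paper)."""
    return max(start * decay ** step, floor)

# Supervised-learning settings reported in Table 10:
# Gumbel temperature tau: starts at 1, decay 0.995, final value ~0.5
# Gumbel scale beta:      starts at 1, decay 0.98,  final value ~0.005
# dropout probability:    starts at 0.1, decay 0.98, final value ~0.0005
tau = decayed(1.0, 0.995, step=200, floor=0.5)
beta = decayed(1.0, 0.98, step=200, floor=0.005)
drop = decayed(0.1, 0.98, step=200, floor=0.0005)
```

With enough decay steps, each quantity settles at its floor, matching the "approximate final value" column; before that, it follows the geometric decay.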