Neural Reasoning for Sure Through Constructing Explainable Models
Authors: Tiansi Dong, Mateja Jamnik, Pietro Liò
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, HSphNN achieves the symbolic-level rigour of syllogistic reasoning and successfully checks both decisions and explanations of ChatGPT (gpt-3.5-turbo and gpt-4o) without errors. We show ways to extend HSphNN for various kinds of logical and Bayesian reasoning, and to integrate it with traditional neural networks seamlessly. |
| Researcher Affiliation | Academia | Department of Computer Science and Technology, University of Cambridge, 15 JJ Thomson Ave, Cambridge, UK |
| Pseudocode | Yes | Algorithm 1: Constraint optimisation. Input: O_X, O_Y, O_Z, T_ZX, T_ZY. Output: O_Z. 1: Optimise O_Z to satisfy T_ZY(O_Z, O_Y); ... Algorithm 2: The control process of HSphNN. Input: target relations ψ_1(O_1, O_2), ψ_2(O_2, O_3), ψ_3(O_3, O_1). Output: SAT or UNSAT. 1: Initialise O_1, O_2, and O_3 as coinciding; ... |
| Open Source Code | Yes | Code and Data https://github.com/gnodisnait/hsphnn |
| Open Datasets | Yes | Code and Data: https://github.com/gnodisnait/hsphnn. Among the 256 types of syllogistic reasoning statements, only 24 reasoning types are valid. |
| Dataset Splits | No | The paper describes a neural network (HSphNN) that performs syllogistic reasoning without requiring training data. It evaluates performance on the full set of 256 Aristotelian syllogistic reasoning types, which are formal logical statements, not a dataset that typically undergoes training/validation/test splits. The concept of dataset splits is therefore not applicable to this paper. |
| Hardware Specification | Yes | All experiments were conducted on a MacBook Pro with an Apple M1 Max (10-core CPU/24-core GPU) and 32 GB memory. |
| Software Dependencies | No | No specific software dependencies with version numbers are mentioned for the implementation of HSphNN. The paper mentions evaluating ChatGPT (gpt-3.5-turbo and gpt-4o), but these are models being tested, not dependencies of the authors' own method. |
| Experiment Setup | Yes | Three spheres are randomly initialised as coinciding in a Poincaré disk, with the centres O following the uniform distribution and with the Euclidean radius r = 0.3. A Poincaré sphere is computed by setting its Euclidean centre to 0.9 sin(O) and its Euclidean radius to sin²(r). This prevents HSphNN from pushing spheres close to the boundary of the Poincaré disk and producing NaN values. We set the learning rate to 0.0001 and the maximum number of epochs M = 1. |
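The pseudocode excerpt in the Pseudocode row (Algorithm 2) can be sketched as a minimal control loop: initialise the spheres as coinciding, optimise each sphere toward its target relation, and report SAT once all target relations hold, else UNSAT after the epoch budget. This is an illustrative sketch only, not the authors' implementation; the sphere representation and the `satisfied`/`optimise_step` placeholders are hypothetical.

```python
def coinciding_spheres(n=3, radius=0.3):
    """Initialise n spheres as coinciding: same centre, same radius."""
    centre = (0.0, 0.0)
    return [{"centre": centre, "radius": radius} for _ in range(n)]


def satisfied(relation, spheres):
    """Check whether a target relation holds for the current spheres.
    Here a relation is modelled as a predicate over the sphere list."""
    return relation(spheres)


def optimise_step(spheres, relation):
    """Placeholder for one constraint-optimisation step (Algorithm 1)
    nudging the spheres toward satisfying `relation`."""
    pass


def control_process(relations, max_epochs=1):
    """Sketch of Algorithm 2: return SAT if every target relation
    can be satisfied within the epoch budget, else UNSAT."""
    spheres = coinciding_spheres()
    for _ in range(max_epochs):
        for rel in relations:
            if not satisfied(rel, spheres):
                optimise_step(spheres, rel)
        if all(satisfied(rel, spheres) for rel in relations):
            return "SAT"
    return "UNSAT"
```

With the paper's setting M = 1, `max_epochs=1` gives a single pass over the target relations before the SAT/UNSAT verdict.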
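The initialisation in the Experiment Setup row can be sketched as follows. This is one reading of the (extraction-damaged) formula: uniformly distributed centres with Euclidean radius r = 0.3, mapped to a Poincaré sphere with centre 0.9·sin(O) (applied componentwise here, an assumption) and radius sin²(r), which keeps spheres away from the disk boundary.

```python
import math
import random

def init_poincare_sphere(r=0.3, dim=2, rng=random):
    """Sketch of the paper's sphere initialisation (details assumed).
    Centre components are drawn uniformly, then squashed by 0.9*sin(.)
    so the sphere stays away from the Poincare disk boundary."""
    centre = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    poincare_centre = [0.9 * math.sin(c) for c in centre]
    poincare_radius = math.sin(r) ** 2  # sin^2(r), about 0.087 for r = 0.3
    return poincare_centre, poincare_radius
```

Because |0.9·sin(c)| < 0.9 for any real c, every initial centre lies strictly inside the unit disk, matching the stated goal of avoiding NaN values near the boundary.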