Neural Spacetimes for DAG Representation Learning
Authors: Haitz Sáez de Ocáriz Borde, Anastasis Kratsios, Marc T. Law, Xiaowen Dong, Michael Bronstein
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our framework computationally with synthetic weighted DAGs and real-world network embeddings; in both cases, the NSTs achieve lower embedding distortions than their counterparts using fixed spacetime geometries. ... We experimentally validate our model on synthetic metric DAG datasets, as well as real-world directed graphs that involve web hyperlink connections and gene expression networks, respectively. ... Section 4 EXPERIMENTAL RESULTS |
| Researcher Affiliation | Collaboration | Haitz Sáez de Ocáriz Borde (University of Oxford), Anastasis Kratsios (McMaster University & Vector Institute), Marc T. Law (NVIDIA), Xiaowen Dong (University of Oxford), Michael Bronstein (University of Oxford & AITHYRA) |
| Pseudocode | Yes | Algorithm 1: Neural (quasi-)metric, D; Algorithm 2: Neural Partial Order, T; Algorithm 3: Neural Spacetime, S = (E, D, T) (Forward Pass) |
| Open Source Code | No | The paper does not provide an open-source code link or state that code is available in supplementary materials. It mentions using the NetworkX library but does not release its own code. |
| Open Datasets | Yes | We test our approach on real-world networks. In Table 2, we present results for the Cornell, Texas, and Wisconsin (WebKB) datasets (Rozemberczki et al., 2021)... We also work with real-world gene regulatory network datasets (Marbach et al., 2012)... We conduct an additional experiment on the ogbn-arxiv dataset (Hu et al., 2021) |
| Dataset Splits | Yes | We reimplement all baselines and test them on the Cornell, Texas, and Wisconsin datasets using 10-fold splits, based on the masks provided in PyTorch Geometric. |
| Hardware Specification | No | The paper mentions general computing environments for experiments (e.g., 'on a GPU', 'on a server') but does not specify any particular hardware details such as exact GPU or CPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper mentions 'NetworkX', 'AdamW optimizer', and 'PyTorch Geometric' but does not specify any version numbers for these software components. |
| Experiment Setup | Yes | We employ a batch size of 10,000 to learn the distances, train for 10 epochs with a learning rate of 3x10^-3 and AdamW optimizer, and apply a max gradient norm of 1. All encoders have a total of 10 hidden layers with 100 neurons and a final projection layer to the embedding dimension. The neural (quasi-)metric has a total of 4 layers, with a hidden layer dimension equal to the event embedding dimensions, that is, either 2 or 4, and the last layer projects the representation to a scalar, i.e., the predicted distance. ... We train for 5,000 epochs with a learning rate of 10^-4 using the AdamW optimizer, and apply a max gradient norm of 1. |