Weighted Embeddings for Low-Dimensional Graph Representation

Authors: Thomas Bläsius, Jean-Pierre von der Heydt, Maximilian Katzmann, Nikolai Maas

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide the embedding algorithm WEMBED and demonstrate, based on generated as well as over 2000 real-world graphs, that our weighted embeddings heavily outperform state-of-the-art Euclidean embeddings for heterogeneous graphs while using fewer dimensions. The running time of WEMBED and the embedding quality for the remaining instances are on par with state-of-the-art Euclidean embedders."
Researcher Affiliation | Academia | Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Pseudocode | No | The paper describes the WEMBED algorithm in paragraph text, detailing its steps and components such as weight estimation, gradient descent optimization, and the use of geometric data structures. However, it does not present this information within a clearly structured pseudocode block or algorithm figure.
Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the described methodology (the WEMBED algorithm) or a direct link to a code repository.
Open Datasets | Yes | "We use a set of real-world networks described by Bläsius and Fischbeck (2024) restricted to graphs with n ≤ 10k nodes and m ≤ 50k edges, which amounts to 2173 graphs. We refer to this data set as real."
Dataset Splits | No | The paper evaluates embedding quality via reconstruction accuracy and F1-scores. It mentions sampling "10m non-edges" to approximate F1-scores on large graphs, but it does not specify explicit training, validation, or test splits, either in the traditional supervised-learning sense or for the embedding process itself.
Hardware Specification | Yes | "All experiments were run on a machine with two AMD EPYC Zen2 7742 CPUs (64 cores each) with 2.25-3.35 GHz frequency, 1024 GB RAM and 256 MB L3 cache. To account for randomness, each data point is the average of five runs. To get comparable running times, computation is single-threaded. The only exception is the much slower POINCARÉEMBED, for which we compute only one embedding and use all 128 available cores."
Software Dependencies | No | The paper mentions using the Adam optimizer and the R-tree implementation provided by Boost. However, it does not specify version numbers for these software components or for any other libraries used, which is required for a reproducible description of software dependencies.
Experiment Setup | Yes | "We used the default parameters for DEEPWALK, sigmoid and negative sampling (option 6) for FORCE2VEC, and adjacency similarity (verse neigh) for VERSE. The default parameters for POINCARÉEMBED did not yield good results and we used lr = 0.3, epochs = 300, and batchsize = 50. All experiments were run on a machine with two AMD EPYC Zen2 7742 CPUs (64 cores each) with 2.25-3.35 GHz frequency, 1024 GB RAM and 256 MB L3 cache. To account for randomness, each data point is the average of five runs. To get comparable running times, computation is single-threaded. The only exception is the much slower POINCARÉEMBED, for which we compute only one embedding and use all 128 available cores."
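The Pseudocode row notes that WEMBED is described only in paragraph text (weight estimation, gradient-descent optimization, geometric data structures). A minimal sketch of the weighted-embedding idea follows; the GIRG-style connection rule, the scaling exponent 1/d, and all function names are illustrative assumptions, not the authors' actual formulation.

```python
import numpy as np

def weighted_distance(x_u, x_v, w_u, w_v, d):
    """Weighted distance: Euclidean distance shrunk by the node weights.

    WEMBED's exact scaling is not given as pseudocode in the paper; dividing
    by (w_u * w_v)**(1/d), as in GIRG-style models, is an assumed choice.
    """
    return np.linalg.norm(x_u - x_v) / (w_u * w_v) ** (1.0 / d)

def predict_edges(X, w, threshold=1.0):
    """Predict an edge wherever the weighted distance falls below `threshold`."""
    n, d = X.shape
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            if weighted_distance(X[u], X[v], w[u], w[v], d) < threshold:
                edges.append((u, v))
    return edges

# Toy example: two heavy (e.g. high-degree) nodes connect across the same
# geometric separation at which two light nodes do not.
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
w = np.array([4.0, 4.0, 1.0, 1.0])
print(predict_edges(X, w))  # → [(0, 1)]
```

Making the connection radius weight-dependent is what lets such embeddings capture heterogeneous degree distributions in few dimensions: hubs get large weights instead of needing extra coordinates.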
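The Dataset Splits row mentions approximating reconstruction F1-scores on large graphs by sampling 10m non-edges rather than enumerating all Θ(n²) node pairs. A sketch of that evaluation scheme, assuming a generic edge predictor; the helper names and the uniform sampling strategy are illustrative, only the 10m sample size comes from the paper.

```python
import random

def f1_from_samples(edges, sampled_non_edges, predict):
    """Approximate reconstruction F1: score the predictor on all true edges
    and on a sample of non-edges standing in for the full negative class."""
    tp = sum(predict(u, v) for u, v in edges)
    fn = len(edges) - tp
    fp = sum(predict(u, v) for u, v in sampled_non_edges)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def sample_non_edges(n, edges, k, seed=0):
    """Uniformly sample k node pairs that are not edges (illustrative helper)."""
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    out = []
    while len(out) < k:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and frozenset((u, v)) not in edge_set:
            out.append((u, v))
    return out

edges = [(0, 1), (1, 2), (2, 3)]
non_edges = sample_non_edges(4, edges, k=10 * len(edges))  # the paper's 10m
perfect = lambda u, v: (u, v) in edges or (v, u) in edges
print(f1_from_samples(edges, non_edges, perfect))  # a perfect predictor scores 1.0
```

Because every true edge is kept while only the negatives are subsampled, recall is exact and only precision carries sampling error.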
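The Experiment Setup row lists the hyperparameters the authors changed away from the defaults for POINCARÉEMBED. Collected as a small config sketch; the key names and dictionary structure are illustrative assumptions, only the values are quoted from the paper.

```python
# Values quoted from the experiment setup; key names are illustrative,
# not the embedders' actual CLI flags or API parameters.
poincare_embed_params = {"lr": 0.3, "epochs": 300, "batchsize": 50}

# The remaining baselines ran with defaults plus the quoted options.
force2vec_options = {"activation": "sigmoid", "sampling": "negative (option 6)"}
verse_options = {"similarity": "adjacency (verse neigh)"}

print(poincare_embed_params)
```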