Weighted Embeddings for Low-Dimensional Graph Representation
Authors: Thomas Bläsius, Jean-Pierre von der Heydt, Maximilian Katzmann, Nikolai Maas
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide the embedding algorithm WEMBED and demonstrate, on generated graphs as well as over 2000 real-world graphs, that our weighted embeddings heavily outperform state-of-the-art Euclidean embeddings for heterogeneous graphs while using fewer dimensions. The running time of WEMBED and the embedding quality for the remaining instances are on par with state-of-the-art Euclidean embedders. |
| Researcher Affiliation | Academia | Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany |
| Pseudocode | No | The paper describes the WEMBED algorithm in paragraph text, detailing its steps and components such as weight estimation, gradient descent optimization, and the use of geometric data structures. However, it does not present this information within a clearly structured pseudocode block or algorithm figure. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the described methodology (WEMBED algorithm) or a direct link to a code repository. |
| Open Datasets | Yes | We use a set of real-world networks described by Bläsius and Fischbeck (2024) restricted to graphs with n ≤ 10k nodes and m ≤ 50k edges, which amounts to 2173 graphs. We refer to this data set as real. |
| Dataset Splits | No | The paper evaluates embedding quality based on reconstruction accuracy and calculates F1-scores. It mentions sampling '10m non-edges' for approximating F1-scores on large graphs, but it does not specify explicit training, validation, or test splits for the datasets in the traditional sense for supervised learning tasks or for the embedding process itself. |
| Hardware Specification | Yes | All experiments were run on a machine with two AMD EPYC Zen2 7742 (64 cores) with 2.25-3.35 GHz frequency, 1024GB RAM and 256MB L3 cache. To account for randomness, each data point is the average of five runs. To get comparable running times, computation is single-threaded. The only exception is the much slower POINCARÉEMBED, for which we compute only one embedding and use all 128 available cores. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' and 'R-tree implementation provided by Boost'. However, it does not specify version numbers for these software components or any other libraries used, which is required for a reproducible description of software dependencies. |
| Experiment Setup | Yes | We used the default parameters for DEEPWALK, sigmoid and negative sampling (option 6) for FORCE2VEC, and adjacency similarity (verse neigh) for VERSE. The default parameters for POINCARÉEMBED did not yield good results and we used lr = 0.3, epochs = 300, and batchsize = 50. All experiments were run on a machine with two AMD EPYC Zen2 7742 (64 cores) with 2.25-3.35 GHz frequency, 1024GB RAM and 256MB L3 cache. To account for randomness, each data point is the average of five runs. To get comparable running times, computation is single-threaded. The only exception is the much slower POINCARÉEMBED, for which we compute only one embedding and use all 128 available cores. |
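The evaluation protocol described in the Dataset Splits row (reconstruction F1-score with sampled non-edges on large graphs) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `reconstruction_f1`, the dictionary-based embedding, and the fixed distance threshold are all assumptions; the paper's exact thresholding and non-edge sampling procedure are not specified in the quoted text.

```python
import numpy as np

def reconstruction_f1(embedding, edges, sampled_non_edges, threshold):
    """Approximate reconstruction F1-score of a Euclidean embedding.

    A pair is predicted to be an edge if its embedded distance is below
    `threshold`. On large graphs, the set of non-edges is subsampled
    (the paper mentions sampling 10m non-edges) instead of enumerating
    all of them, so the resulting F1-score is an approximation.
    """
    def dist(u, v):
        return np.linalg.norm(embedding[u] - embedding[v])

    # True positives: actual edges predicted as edges.
    tp = sum(1 for u, v in edges if dist(u, v) < threshold)
    # False negatives: actual edges predicted as non-edges.
    fn = len(edges) - tp
    # False positives, estimated from the non-edge sample only.
    fp = sum(1 for u, v in sampled_non_edges if dist(u, v) < threshold)

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example: nodes 0 and 1 are close (an edge), node 2 is far away.
emb = {0: np.array([0.0, 0.0]), 1: np.array([0.0, 1.0]), 2: np.array([5.0, 5.0])}
score = reconstruction_f1(emb, edges=[(0, 1)],
                          sampled_non_edges=[(0, 2), (1, 2)], threshold=2.0)
```

In the toy example the single edge lies within the threshold and both sampled non-edges lie outside it, so the reconstruction is perfect and the score is 1.0.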