Scikit-network: Graph Analysis in Python

Authors: Thomas Bonald, Nathan de Lara, Quentin Lutz, Bertrand Charpentier

JMLR 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To show the performance of scikit-network, we compare the implementation of some representative algorithms with those of the graph softwares of Table 1: the Louvain clustering algorithm (Blondel et al., 2008), PageRank (Page et al., 1999), HITS (Kleinberg, 1999) and the spectral embedding (Belkin and Niyogi, 2003). For PageRank, the number of iterations is set to 100 when possible (that is for all packages except iGraph). For Spectral, the dimension of the embedding space is set to 16. Table 2 gives the running times of these algorithms on the Orkut graph of Konect (Kunegis, 2013). The graph has 3,072,441 nodes and 117,184,899 edges. The computer has a Debian 10 OS and is equipped with an AMD Ryzen Threadripper 1950X 16-Core Processor and 32 GB of RAM. As we can see, scikit-network is highly competitive. We also give in Table 3 the memory usage of each package when loading the graph. Thanks to the CSR format, scikit-network has a minimal footprint. |
| Researcher Affiliation | Academia | Thomas Bonald, Nathan de Lara, Quentin Lutz: Télécom Paris, Institut Polytechnique de Paris, 91120 Palaiseau, France. Bertrand Charpentier: Technical University of Munich, D-80333 Munich, Germany. |
| Pseudocode | No | The paper describes the `scikit-network` package and its features, discussing various algorithms (ranking, clustering, etc.) that are implemented within the package. It does not provide any structured pseudocode or algorithm blocks for these methods, but rather refers to existing algorithms by name. |
| Open Source Code | Yes | Source code, documentation and installation instructions are available online [1]. [...] Open-source software. The package is hosted on GitHub [2] and part of SciPy kits aimed at creating open-source scientific software. Its BSD license enables maximum interoperability with other software. Guidelines for contributing are described in the package's documentation [3] and guidance is provided by the GitHub-hosted Wiki. [1] See https://scikit-network.readthedocs.io/en/latest/. [2] See https://github.com/sknetwork-team/scikit-network. |
| Open Datasets | Yes | Table 2 gives the running times of these algorithms on the Orkut graph of Konect (Kunegis, 2013). |
| Dataset Splits | No | The paper evaluates the performance of graph analysis algorithms (Louvain, PageRank, HITS, Spectral) on the Orkut graph. It does not describe any machine learning model training that would require explicit training/test/validation dataset splits, but rather benchmarks existing algorithms on a single, large graph. |
| Hardware Specification | Yes | The computer has a Debian 10 OS and is equipped with an AMD Ryzen Threadripper 1950X 16-Core Processor and 32 GB of RAM. |
| Software Dependencies | No | The package is distributed under the BSD license, with dependencies limited to NumPy and SciPy. It is compatible with Python 3.6 and newer. [...] Scikit-network relies on a very limited number of external dependencies for ease of installation and maintenance. Only SciPy and NumPy are required on the user side. [...] In order to speed up execution times, Cython (Behnel et al., 2011) generates C++ files automatically using a Python-like syntax. The paper mentions Python 3.6+, and NumPy, SciPy, and Cython as dependencies, but does not provide specific version numbers for these libraries used in the experiments. |
| Experiment Setup | Yes | For PageRank, the number of iterations is set to 100 when possible (that is for all packages except iGraph). For Spectral, the dimension of the embedding space is set to 16. |
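
The excerpts above mention two technical points that a small example may make more concrete: the CSR (compressed sparse row) layout credited for the low memory footprint, and PageRank run for a fixed number of iterations (100). The following is a minimal, standard-library-only sketch, not scikit-network code and not the paper's benchmark. The function names `build_csr`, `csr_nbytes` and `pagerank` are hypothetical, and several choices are assumptions not stated in the source: a damping factor of 0.85, 4-byte indices, 1-byte edge values, and each undirected edge stored in both directions. The byte estimate is back-of-the-envelope arithmetic and is not claimed to match Table 3.

```python
from itertools import accumulate


def build_csr(n_nodes, edges):
    """Build CSR arrays (indptr, indices) for an undirected, unweighted graph.

    Assumes no self-loops; each edge (u, v) is stored in both rows u and v.
    """
    degree = [0] * n_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # indptr[i]..indptr[i + 1] delimits the neighbours of node i in `indices`.
    indptr = [0] + list(accumulate(degree))
    indices = [0] * indptr[-1]
    cursor = indptr[:-1].copy()  # next free slot in each row
    for u, v in edges:
        indices[cursor[u]] = v
        cursor[u] += 1
        indices[cursor[v]] = u
        cursor[v] += 1
    return indptr, indices


def csr_nbytes(n_nodes, n_edges, index_bytes=4, value_bytes=1):
    """Rough CSR size in bytes under the assumed dtypes (hypothetical helper)."""
    nnz = 2 * n_edges  # assumption: symmetric storage of undirected edges
    return (n_nodes + 1) * index_bytes + nnz * (index_bytes + value_bytes)


def pagerank(indptr, indices, n_iter=100, damping=0.85):
    """Power iteration with a fixed iteration count, as in the reported setup.

    The damping factor is an assumed value; the source does not state it.
    """
    n = len(indptr) - 1
    scores = [1.0 / n] * n
    for _ in range(n_iter):
        pushed = [0.0] * n
        dangling = 0.0  # mass held by nodes without neighbours
        for u in range(n):
            start, end = indptr[u], indptr[u + 1]
            if start == end:
                dangling += scores[u]
                continue
            share = scores[u] / (end - start)
            for k in range(start, end):
                pushed[indices[k]] += share
        # Teleportation plus uniform redistribution of dangling mass.
        base = (1.0 - damping) / n + damping * dangling / n
        scores = [base + damping * x for x in pushed]
    return scores


if __name__ == "__main__":
    # Toy star graph: node 0 linked to nodes 1, 2 and 3.
    indptr, indices = build_csr(4, [(0, 1), (0, 2), (0, 3)])
    print(indptr, indices)
    print(pagerank(indptr, indices))
    # Size estimate for a graph with the Orkut node and edge counts quoted above.
    print(csr_nbytes(3_072_441, 117_184_899))
```

The point of the layout is that storage grows with the number of edges plus the number of nodes, with no per-node or per-edge Python objects, which is consistent with the small footprint the paper attributes to CSR.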