Graph Cuts with Arbitrary Size Constraints Through Optimal Transport

Authors: Chakib Fettal, lazhar labiod, Mohamed Nadif

TMLR 2024 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental 5 Experiments We evaluated the clustering performance of our two variants OT-ncut and OT-rcut algorithms against the spectral clustering algorithm and state-of-the-art OT-based graph clustering approaches. 5.1 Datasets We perform experiments on graphs constructed from image datasets, namely, MNIST (Deng, 2012), Fashion MNIST (Xiao et al., 2017) and KMNIST (Clanuwat et al., 2018). ... Table 2: Average ( sd) clustering performance and running times on the graph built from images. ... Table 3: Average ( sd) clustering performance and running times on the graph datasets.
Researcher Affiliation Academia Chakib Fettal EMAIL CDC Informatique Centre Borelli, UMR 9010, Université Paris Cité Lazhar Labiod EMAIL Centre Borelli, UMR 9010, Université Paris Cité Mohamed Nadif EMAIL Centre Borelli, UMR 9010, Université Paris Cité
Pseudocode Yes Algorithm 1: Nonconvex Accelerated PGD for OT-cut
Open Source Code Yes For reproducibility purposes, we release our code1. 1https://github.com/chakib401/OT-cut
Open Datasets Yes We perform experiments on graphs constructed from image datasets, namely, MNIST (Deng, 2012), Fashion MNIST (Xiao et al., 2017) and KMNIST (Clanuwat et al., 2018). We also consider four graph datasets: DBLP, a co-term citation network; and ACM, a coauthor citation networks (Fan et al., 2020). EU-Email an email network from a large European research institution (Leskovec & Krevl, 2014). Indian-Village describes interactions among villagers in Indian villages (Banerjee et al., 2013). ... Results on long-tailed versions of CIFAR-10 are reported in table 5.
Dataset Splits No The paper uses various datasets (MNIST, Fashion-MNIST, KMNIST, DBLP, ACM, Village, EU-Email, CIFAR-10) for clustering experiments. However, it does not explicitly provide details about specific training, test, or validation dataset splits, percentages, or sample counts for these experiments, which are typically found in supervised learning contexts. For clustering, the entire dataset is generally used for the algorithm.
Hardware Specification Yes All experiments were run five times and were performed on a 64gb RAM machine with a 12th Gen Intel(R) Core(TM) i9-12950HX (24 CPUs) processor with a frequency of 2.3GHz.
Software Dependencies No Our two variants, OT-ncut and OT-rcut are implemented via the Python optimal transport package (POT) (Flamary et al., 2021). ... For the baselines, we use the Scikit-Learn (Pedregosa et al., 2011) implementation of spectral clustering. ... We use the implementations of Louvain, Fluid, and Greedy provided in the networkx package (Hagberg et al., 2008). We use the implementation of Infomap provided in Edler et al. (2024). The paper mentions software packages used but does not provide specific version numbers for them.
Experiment Setup Yes We use random initialization and use uniform target distributions unless explicitly stated otherwise. We also set α = 1/2 and the number of iterations to 30 for the image graphs and 20 for the other graphs. We also use normalized laplacian matrices.