Fairness-Aware Dense Subgraph Discovery

Authors: Emmanouil Kariotakis, Nicholas D Sidiropoulos, Aritra Konar

TMLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on real-world datasets demonstrate that our methods not only match but frequently outperform existing solutions, sometimes incurring even less than half the subgraph density loss compared to prior art, while achieving the target fairness levels. Importantly, they excel in scenarios that previous methods fail to adequately handle, i.e., those with extreme subgroup imbalances, highlighting their effectiveness in extracting fair and dense solutions. ... 7 Experimental Evaluation: We consider various real-world datasets, with Table 1 summarizing their statistics (for more information, see Appendix F). The implementation of our approach is publicly available on Github1.
Researcher Affiliation Academia Emmanouil Kariotakis EMAIL ESAT-STADIUS KU Leuven Nicholas D. Sidiropoulos EMAIL Department of Electrical Engineering University of Virginia Aritra Konar EMAIL ESAT-STADIUS KU Leuven
Pseudocode Yes Additionally, it is straightforward to implement it using the objective functions of FADSG-I and II (for more details, see [Algorithm 1, Appendix E]). ... Pseudo-code of the algorithm is provided in (Algorithm 1).
Open Source Code Yes The implementation of our approach is publicly available on Github1. 1https://github.com/ekariotakis/fadsg_tmlr2025
Open Datasets Yes We consider various real-world datasets, with Table 1 summarizing their statistics (for more information, see Appendix F). ... Political Books Dataset (Pol Books) ... Political Blogs Dataset (Pol Blogs) (Adamic & Glance, 2005) ... Amazon Products Metadata (Amazon) (Ni et al., 2019) ... Last FM Asia Social Network (Last FM) (Rozemberczki & Sarkar, 2020) ... Deezer Europe Social Network (Deezer) (Rozemberczki & Sarkar, 2020) ... Git Hub Developers (Git Hub) (Rozemberczki et al., 2019) ... Twitch Users Social Networks (Twitch) (Rozemberczki et al., 2019) ... Twitter Retweet Political Network (Twitter) (Rossi & Ahmed, 2015; Conover et al., 2011)
Dataset Splits No The paper lists multiple datasets and their summary statistics in Table 1 and Appendix F, but it does not provide specific details on how these datasets were split into training, validation, or test sets for the experiments.
Hardware Specification Yes All experiments were performed on a single machine, with Intel i7-13700K CPU @ 3.4GHz and 128GB of main memory running Python 3.12.0.
Software Dependencies Yes All experiments were performed on a single machine, with Intel i7-13700K CPU @ 3.4GHz and 128GB of main memory running Python 3.12.0.
Experiment Setup Yes In order to find the appropriate λ value for the desired target fairness level that we had for each experiment, we performed bisection on λ values. The lower bound was always 0 and the upper was λmax. The value of λmax was computed experimentally for each dataset. We specifically chose the value of λmax based on our observation that it led to the densest protected subgraph for FADSG-I and the entire protected subgraph for FADSG-II. ... Furthermore, this bisection was performed with a tolerance on λ, ε = 10-9. ... we determined that a value of T = 5 iterations provided satisfactory results across all our datasets in our implementation of Super-Greedy++.