SLRL: Semi-Supervised Local Community Detection Based on Reinforcement Learning
Authors: Li Ni, Rui Ye, Wenjian Luo, Yiwen Zhang, Lei Zhang, Victor S. Sheng
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that SLRL outperforms state-of-the-art algorithms on five real-world datasets. |
| Researcher Affiliation | Academia | Li Ni1,2, Rui Ye1, Wenjian Luo3, Yiwen Zhang1*, Lei Zhang1, Victor S. Sheng4. 1School of Computer Science and Technology, Anhui University, China; 2Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, China; 3School of Computer Science and Technology, Harbin Institute of Technology, China; 4Department of Computer Science, Texas Tech University, USA |
| Pseudocode | Yes | Algorithm 1: Optimization. Input: G, Ck. 1: Initialize agent with parameters Q. 2: for epochs do 3: Generate a training sample set D. 4: for each sample in D do 5: Obtain the trajectory τ = (S0, a1, r1, ..., ST). 6: Compute the cumulative reward Gt for each time step. 7: end for 8: Update Q based on Equation (8). 9: Teacher-force π using MLE on Ck. 10: end for |
| Open Source Code | Yes | In this section, we conduct experiments to evaluate the performance of SLRL, and the code of SLRL is available at https://github.com/nilics/AAAI2025-SLRL.git |
| Open Datasets | Yes | Dataset: We utilize five real-world datasets, including Amazon (Yang and Leskovec 2012), DBLP (Backstrom et al. 2006), LiveJournal (Backstrom et al. 2006), YouTube (Mislove et al. 2007), and Twitter (McAuley and Leskovec 2012). |
| Dataset Splits | Yes | For each dataset, we select given nodes as follows: 1000 nodes from Amazon with an interval of 13 (selecting one node out of every 13), 1000 nodes from DBLP with an interval of 114, 500 nodes from LiveJournal with an interval of 633, 500 nodes from YouTube with an interval of 433, and 500 nodes from Twitter with an interval of 41. In addition, the number of clusters is set to 2, which is determined to maximize the silhouette coefficients (Rousseeuw 1987). ... Two groups of ground-truth communities, each comprising 100 communities, are selected as the known communities for the experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper refers to various methods and components such as the 'incremental Graph Pointer Network (iGPN)', 'Multilayer Perceptron (MLP)', 'spectral clustering', 'K-medoids', and 'Hierarchical Agglomerative Clustering', but it does not specify any software library names with version numbers. |
| Experiment Setup | Yes | The number of communities generated by CLARE and SEAL is set to 5000 on the Amazon, DBLP, LiveJournal, YouTube, and Twitter datasets. Similar to SEAL, SLRL terminates when the community size exceeds the maximum size of known communities. ... In addition, the number of clusters is set to 2, which is determined to maximize the silhouette coefficients (Rousseeuw 1987). ... During the experiments, we set γ to 1 for all datasets. ... SEAL, CLARE, CommunityAF, and SLRL are run three times for each group using different random seeds. The average of the metrics over these three runs is reported for each group. |
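Algorithm 1 computes a cumulative reward Gt for each time step of a trajectory before updating the agent. The paper only names Gt, so the sketch below assumes the standard discounted-return definition, with γ = 1 as the paper sets for all datasets:

```python
def cumulative_rewards(rewards, gamma=1.0):
    """Compute G_t = r_{t+1} + gamma * G_{t+1} for every step of a trajectory.

    `rewards` is the list (r_1, ..., r_T) collected along a trajectory
    (S_0, a_1, r_1, ..., S_T). The discounted-sum definition of G_t is an
    assumption; the paper does not spell it out.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the trajectory backwards so each return builds on its successor.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# With gamma = 1 (the paper's setting), G_t reduces to a plain suffix sum:
print(cumulative_rewards([1.0, 0.0, 2.0]))  # -> [3.0, 2.0, 2.0]
```

With γ = 1 every step's return is just the sum of all remaining rewards, which is why the paper can report a single scalar return per sampled trajectory step.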
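The Dataset Splits entry describes picking seed nodes at fixed intervals (one node per 13 positions on Amazon, per 114 on DBLP, and so on). A minimal sketch of that selection, assuming the nodes come in some fixed ordering and selection starts from the first node (the paper specifies neither):

```python
def select_seed_nodes(nodes, interval, count):
    """Pick `count` nodes, one every `interval` positions, from an ordered list.

    Mirrors the description "1000 nodes from Amazon with an interval of 13".
    The starting offset and the node ordering are assumptions.
    """
    selected = nodes[::interval][:count]
    if len(selected) < count:
        raise ValueError("node list too short for the requested count/interval")
    return selected

# A toy stand-in for Amazon's node list: 1000 seeds at interval 13 need
# at least 12,988 ordered nodes.
amazon_like = list(range(13000))
seeds = select_seed_nodes(amazon_like, interval=13, count=1000)
print(len(seeds), seeds[:3])  # -> 1000 [0, 13, 26]
```

This kind of strided sampling gives a deterministic, roughly uniform spread of seeds across the node ordering, which matches the paper's per-dataset (interval, count) pairs.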