Weisfeiler and Leman Go Gambling: Why Expressive Lottery Tickets Win

Authors: Lorenz Kummer, Samir Moustafa, Anatol Ehrlich, Franka Bause, Nikolaus Suess, Wilfried N. Gansterer, Nils Morten Kriege

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our work contributes to closing this gap by providing both formal and empirical evidence that preserving expressivity in sparsely initialized GNNs is crucial for finding winning tickets. We structure our experiments to address the primary research question, which also drove our theoretical analysis, namely how the pre-training expressivity of a lottery ticket affects its post-training accuracy. We investigate this research question by utilizing 10 real-world datasets from the TUDataset repository (Morris et al., 2020)."
Researcher Affiliation | Academia | "¹Faculty of Computer Science, University of Vienna, Vienna, Austria ²Doctoral School Computer Science, University of Vienna, Vienna, Austria ³Research Network Data Science, University of Vienna, Vienna, Austria. Correspondence to: Lorenz Kummer <EMAIL>."
Pseudocode | No | No pseudocode block or algorithm section is present. The methodology is described in prose and mathematical equations.
Open Source Code | Yes | "The code for reproducing our results is available at GitHub: https://github.com/lorenz0890/wl2025lottery"
Open Datasets | Yes | "We investigate this research question by utilizing 10 real-world datasets from the TUDataset repository (Morris et al., 2020). These datasets, which are described in detail in Appendix B, are widely used in current studies."
Dataset Splits | No | No details about training, validation, and test splits (e.g., percentages, counts, or k-fold cross-validation) are explicitly given for the datasets used. The paper states, "Results aggregated from training 13,500 runs over 10 datasets," but does not elaborate on the splitting strategy for individual datasets.
Hardware Specification | Yes | "The experiments took approximately 8 weeks with three parallel workers to conclude and were conducted on a local server equipped with an NVIDIA H100 PCIe GPU (80GB VRAM), an Intel Xeon Gold 6326 CPU (500GB RAM) and a 1TB SSD."
Software Dependencies | No | No specific versions for software libraries such as PyTorch, TensorFlow, Python, or CUDA are mentioned. Only general components such as ReLU activations and the Adam optimizer are referred to, without version numbers.
Experiment Setup | Yes | "All models are trained for 250 epochs with a batch size of 32 and a learning rate of 0.01, using the Adam optimizer. In line with LTH, only non-zero weights are updated. We use ReLU activations and initialize network parameters W^(j) randomly from a uniform distribution U(-sqrt(1/m_j), sqrt(1/m_j)) with m_j = |I^(j)|, following common variance scaling initialization schemes (Glorot & Bengio, 2010; He et al., 2015)."
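The experiment setup quoted above combines two mechanics: variance-scaled uniform initialization with bound sqrt(1/m_j), and LTH-style training where pruned (zeroed) weights are frozen. A minimal pure-Python sketch of both, assuming a flat weight vector and a binary pruning mask; function names are illustrative and this is not the authors' implementation:

```python
import math
import random

def variance_scaled_init(fan_in, n, seed=0):
    """Draw n weights from U(-sqrt(1/fan_in), +sqrt(1/fan_in)),
    mirroring the variance scaling scheme quoted above."""
    rng = random.Random(seed)
    bound = math.sqrt(1.0 / fan_in)
    return [rng.uniform(-bound, bound) for _ in range(n)]

def masked_step(weights, mask, grads, lr=0.01):
    """One LTH-style gradient step: entries pruned by the mask (mask value 0)
    are held at zero; only surviving weights are updated."""
    return [(w - lr * g) if m else 0.0
            for w, m, g in zip(weights, mask, grads)]
```

For example, with `fan_in=4` every initial weight lies in [-0.5, 0.5], and a weight masked out before training remains exactly zero after every `masked_step`, which is what "only non-zero weights are updated" means in the LTH setting.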
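The expressivity notion referenced in the paper's title and research question is the one-dimensional Weisfeiler-Leman (1-WL) test, which bounds the distinguishing power of standard message-passing GNNs. A minimal sketch of 1-WL colour refinement for intuition (illustrative only, not the authors' code):

```python
from collections import Counter

def wl_refine(adj, rounds=3):
    """1-dimensional Weisfeiler-Leman colour refinement.
    adj: dict mapping node -> list of neighbours; all nodes start with colour 0.
    Returns the final colour histogram of the graph."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        # signature = own colour plus sorted multiset of neighbour colours
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                for v in adj}
        # compress signatures back to small integer colours
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {v: palette[sigs[v]] for v in adj}
    return Counter(colors.values())
```

Two graphs with different histograms are provably non-isomorphic; the classic limitation is that 1-WL gives two disjoint triangles and a 6-cycle identical histograms, illustrating why preserving (rather than degrading) this distinguishing power in sparsified GNNs matters.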