Towards a Unified and Verified Understanding of Group-Operation Networks

Authors: Wilson Wu, Louis Jaburi, Jacob Drori, Jason Gross

ICLR 2025

Reproducibility assessment (Variable: Result. LLM Response):
Research Type: Experimental. We train 100 one-hidden-layer neural network models from random initialization on the group S5. We then compute lower bounds on accuracy obtained by brute force, the cosets explanation, and the ρ-sets explanation, which we refer to as Vbrute, Vcoset, and Virrep respectively. We evaluate these lower bounds both on their runtime (compactness) and on the tightness of the bound. See Appendix I for full experiment details.
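The brute-force bound Vbrute described above amounts to evaluating the model on every pair in G × G and counting exact matches with the true product. A minimal sketch, using S3 instead of S5 for brevity and a stand-in lookup in place of a trained network's argmax prediction (both assumptions, not the paper's implementation):

```python
# Hedged sketch of a brute-force accuracy lower bound over all |G|^2 pairs.
# S3 stands in for S5; `model` is a hypothetical stand-in for a trained
# network's argmax prediction.
from itertools import permutations

elems = list(permutations(range(3)))          # S3 as permutation tuples, |S3| = 6
index = {g: i for i, g in enumerate(elems)}   # element -> integer label

def compose(g, h):
    # (g h)(x) = g(h(x)): compose permutations given as tuples
    return tuple(g[h[x]] for x in range(3))

def model(i, j):
    # stand-in prediction; here it returns the true product, so the bound is 1.0
    return index[compose(elems[i], elems[j])]

n = len(elems)
correct = sum(model(i, j) == index[compose(elems[i], elems[j])]
              for i in range(n) for j in range(n))
v_brute = correct / n ** 2                    # fraction correct over all |G|^2 pairs
```

With a real model, `v_brute` would be the empirically verified accuracy floor; the paper's point is that the coset and ρ-set explanations yield cheaper (more compact) but potentially looser bounds than this exhaustive check.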
Researcher Affiliation: Collaboration. 1 University of Colorado Boulder, 2 Independent, EMAIL EMAIL
Pseudocode: No. The paper describes algorithms conceptually and discusses verifier programs, but it does not contain any clearly labeled pseudocode blocks or structured algorithm listings.
Open Source Code: Yes. Code for reproducing our experiments can be found at https://anonymous.4open.science/r/groups-E024.
Open Datasets: No. The paper uses mathematical group structures (e.g., the symmetric group S5) as its 'data' for training models on binary operations. While these groups are mathematically defined and accessible, the paper does not provide concrete access information (link, DOI, or repository) for a dataset in the conventional sense of a file or collection of samples.
Dataset Splits: Yes. The test set is all ordered pairs of elements of S5, giving |S5|^2 = 14,400 points in total. The training set comprises i.i.d. samples from the test set and has size equal to 40% of the test set.
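The split sizes stated above can be checked directly from the group itself, since the full dataset is just every ordered pair of S5 elements:

```python
# Verify the dataset sizes: |S5| = 5! elements, |S5|^2 ordered pairs,
# and a training set of 40% of all pairs (integer arithmetic avoids
# floating-point rounding).
from itertools import permutations

s5 = list(permutations(range(5)))
n_pairs = len(s5) ** 2            # |S5|^2
train_size = n_pairs * 40 // 100  # 40% of all pairs

# len(s5) == 120, n_pairs == 14400, train_size == 5760
```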
Hardware Specification: Yes. All models were trained on one NVIDIA A6000 GPU.
Software Dependencies: No. Neural networks were implemented in PyTorch (Paszke et al., 2019). Their group-theoretic properties were analyzed with GAP (GAP, 2024). Specific version numbers for PyTorch and GAP are not provided.
Experiment Setup: Yes. For the main text, we train 100 one-hidden-layer models with hidden dimensionality m = 128 on the group S5. ... Each model was trained over 25,000 epochs. The learning rate was set to 1e-2. We use the Adam optimizer (Kingma & Ba, 2015) with weight decay 2e-4 and (β1, β2) = (0.9, 0.98).
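The reported hyperparameters can be assembled into a setup sketch. The architectural details below (a shared embedding for the two inputs, concatenation, then one hidden layer) are assumptions for illustration; only the group size, hidden dimensionality m = 128, and optimizer settings come from the paper:

```python
# Hedged sketch of the training setup: one-hidden-layer network predicting
# the product of two S5 elements. The embedding/concatenation architecture
# is an assumption; hyperparameters match those reported.
import torch
import torch.nn as nn

G = 120   # |S5|
m = 128   # hidden dimensionality (from the paper)

class OneHiddenLayer(nn.Module):
    def __init__(self, n_group, hidden):
        super().__init__()
        self.embed = nn.Embedding(n_group, hidden)   # shared left/right embedding (assumption)
        self.hidden = nn.Linear(2 * hidden, hidden)
        self.out = nn.Linear(hidden, n_group)

    def forward(self, a, b):
        x = torch.cat([self.embed(a), self.embed(b)], dim=-1)
        return self.out(torch.relu(self.hidden(x)))

model = OneHiddenLayer(G, m)
# Optimizer settings as reported: Adam, lr 1e-2, weight decay 2e-4, betas (0.9, 0.98)
opt = torch.optim.Adam(model.parameters(), lr=1e-2,
                       weight_decay=2e-4, betas=(0.9, 0.98))

a = torch.randint(0, G, (32,))   # left operands (batch of 32)
b = torch.randint(0, G, (32,))   # right operands
logits = model(a, b)             # one logit per group element
```

Training would then minimize cross-entropy between `logits` and the index of the true product over the 25,000 epochs described.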