Towards a Unified and Verified Understanding of Group-Operation Networks
Authors: Wilson Wu, Louis Jaburi, Jacob Drori, Jason Gross
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We train 100 one-hidden-layer neural network models from random initialization on the group S5. We then compute lower bounds on accuracy obtained by brute force, the cosets explanation, and the ρ-sets explanation, which we refer to as V_brute, V_coset, and V_irrep, respectively. We evaluate these lower bounds on both their runtime (compactness) and the tightness of the bound. See Appendix I for full experiment details. |
| Researcher Affiliation | Collaboration | ¹University of Colorado Boulder, ²Independent |
| Pseudocode | No | The paper describes algorithms conceptually and discusses verifier programs, but it does not contain any clearly labeled pseudocode blocks or structured algorithm listings. |
| Open Source Code | Yes | Code for reproducing our experiments can be found at https://anonymous.4open.science/r/groups-E024. |
| Open Datasets | No | The paper uses mathematical group structures (e.g., symmetric group S5) as its 'data' for training models on binary operations. While these groups are mathematically defined and accessible, the paper does not provide concrete access information (link, DOI, repository) for a 'dataset' in the conventional sense of a file or collection of samples. |
| Dataset Splits | Yes | The test set is all ordered pairs of elements of S5, with \|S5\|² = 14,400 points total. The training set comprises i.i.d. samples from the test set and has size 40% of the test set. |
| Hardware Specification | Yes | All models were trained on one Nvidia A6000 GPU. |
| Software Dependencies | No | Neural networks were implemented in PyTorch (Paszke et al., 2019). Their group-theoretic properties were analyzed with GAP (GAP, 2024). Specific version numbers for PyTorch and GAP are not provided. |
| Experiment Setup | Yes | For the main text, we train 100 one-hidden-layer models with hidden dimensionality m = 128 on the group S5. ... Each model was trained over 25000 epochs. Learning rate was set to 1e-2. We use the Adam optimizer (Kingma & Ba, 2015) with weight decay 2e-4 and (β1, β2) = (0.9, 0.98). |
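The dataset described in the splits row above can be sketched in a few lines: enumerate S5 as permutations of five elements, label every ordered pair with its product, and draw a 40% i.i.d. training sample. This is a minimal illustration, not the authors' code; the function names and the random seed are assumptions.

```python
# Hypothetical sketch of the S5 binary-operation dataset described above.
# The "test set" is all ordered pairs of S5 elements (14,400 points); the
# training set is an i.i.d. 40% sample of it. Names/seed are assumptions.
from itertools import permutations
import random

# Enumerate S5 as permutations of 5 elements (|S5| = 120).
S5 = list(permutations(range(5)))

def compose(p, q):
    """Group operation: (p . q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(5))

# Full dataset: every ordered pair (a, b), labeled with the product a . b.
full = [((a, b), compose(a, b)) for a in S5 for b in S5]

# Training set: i.i.d. sample of 40% of the full set.
rng = random.Random(0)
train = rng.sample(full, k=int(0.4 * len(full)))

print(len(full))   # 14400
print(len(train))  # 5760
```

Since |S5| = 120, the full set has 120 × 120 = 14,400 labeled pairs, matching the figure in the table.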
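The experiment-setup row pins down the optimizer hyperparameters but not the full architecture. The configuration fragment below is a sketch under those stated values (Adam, lr 1e-2, weight decay 2e-4, betas (0.9, 0.98), hidden width m = 128); the input/output encoding and use of bias-free linear layers are assumptions for illustration only.

```python
# Hypothetical configuration sketch matching the stated hyperparameters.
# Architecture details beyond "one hidden layer, m = 128" are assumptions.
import torch
import torch.nn as nn

m, n = 128, 120  # hidden width, |S5|

# One-hidden-layer network on concatenated one-hot encodings of the
# left and right operands, predicting the one-hot product.
model = nn.Sequential(
    nn.Linear(2 * n, m, bias=False),
    nn.ReLU(),
    nn.Linear(m, n, bias=False),
)

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-2,
    betas=(0.9, 0.98),
    weight_decay=2e-4,
)
```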