Machine learning detects terminal singularities

Authors: Tom Coates, Alexander Kasprzyk, Sara Veneziale

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper we demonstrate that machine learning can be used to understand this classification. We focus on eight-dimensional positively-curved algebraic varieties that have toric symmetry and Picard rank two, and develop a neural network classifier that predicts with 95% accuracy whether or not such an algebraic variety is Q-Fano. We trained a feed-forward neural network classifier on a balanced dataset of 5 million examples; these are eight-dimensional Q-factorial Fano toric varieties of Picard rank two, of which 2.5 million are terminal and 2.5 million non-terminal. Testing on a further balanced dataset of 5 million examples showed that the neural network classifies such toric varieties as terminal or non-terminal with an accuracy of 95%.
Researcher Affiliation | Academia | Tom Coates, Department of Mathematics, Imperial College London, 180 Queen's Gate, London, SW7 2AZ, UK (EMAIL); Alexander M. Kasprzyk, School of Mathematical Sciences, University of Nottingham, Nottingham, NG7 2RD, UK (EMAIL); Sara Veneziale, Department of Mathematics, Imperial College London, 180 Queen's Gate, London, SW7 2AZ, UK (EMAIL).
Pseudocode | Yes | Algorithm 1: Test terminality for weight matrix W = [[a_1, ..., a_N], [b_1, ..., b_N]]. (An illustrative sketch of this weight-matrix input format appears after the table.)
Open Source Code | Yes | All code used and trained models are available from BitBucket under an MIT licence [12]. Supporting code: https://bitbucket.org/fanosearch/ml_terminality, 2023.
Open Datasets | Yes | The datasets underlying this work and the code used to generate them are available from Zenodo under a CC0 license [11]. A dataset of 8-dimensional Q-factorial Fano toric varieties of Picard rank 2. Zenodo, 2023. doi:10.5281/zenodo.10046893.
Dataset Splits | Yes | We tested the model on a balanced subset of 50% of the data (5M); the remainder was used for training (40%; 4M, balanced) and validation (10%; 1M). (A split sketch follows the table.)
Hardware Specification | No | The paper mentions "30 CPU years" for data generation and "120 CPU hours" for ML-assisted generation, but it does not provide specific hardware details such as CPU models, GPU models, or memory specifications.
Software Dependencies | Yes | Data generation and post-processing was carried out using the computational algebra system Magma V2.27-3 [5]. The machine learning model was built using PyTorch v1.13.1 [36] and scikit-learn v1.1.3 [37]. Hyperparameter tuning was partly carried out using Ray Tune [31].
Experiment Setup | Yes | The final best network configuration is summarised in Table 1 (final network architecture and configuration): Layers (512, 768, 512); Batch size 128; Initial learning rate 0.01; Momentum 0.99; Leaky ReLU slope 0.01. (A hedged PyTorch sketch of this configuration follows the table.)
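
The weight-matrix representation referenced in the Pseudocode row can be illustrated concretely. The snippet below is a minimal sketch, assuming that an eight-dimensional Q-factorial toric variety of Picard rank two is described by N = dimension + Picard rank = 10 pairs of integer weights, and that the 2 x N entries are flattened into the classifier's input features; the specific numbers and the flattening step are illustrative assumptions, not values taken from the paper or its dataset.

```python
import numpy as np

# Hypothetical example of the weight-matrix representation described above:
# a 2 x N integer matrix W = [[a_1, ..., a_N], [b_1, ..., b_N]] with N = 10.
# The entries below are illustrative only; they are not from the paper's data.
W = np.array([
    [1, 1, 2, 3, 5, 0, 0, 1, 2, 4],   # a_1, ..., a_10
    [0, 0, 1, 1, 2, 1, 1, 3, 5, 8],   # b_1, ..., b_10
], dtype=np.int64)

# One plausible way to feed W to a feed-forward classifier is to flatten it
# into a fixed-length feature vector of 2 * N = 20 entries.
features = W.flatten().astype(np.float32)
print(features.shape)  # (20,)
```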
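The 40% / 10% / 50% train/validation/test split reported in the Dataset Splits row can be reproduced schematically with scikit-learn, one of the paper's stated dependencies. This is a hedged sketch: the placeholder arrays, the two-stage stratified splitting, and the random seed are assumptions, not the paper's actual data pipeline.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: X would hold the weight-matrix features and y the
# terminal (1) / non-terminal (0) labels for the 10M balanced examples.
X = np.random.rand(10_000, 20).astype(np.float32)
y = np.random.randint(0, 2, size=10_000)

# First peel off the balanced 50% test set, then split the remainder 80/20
# to obtain 40% training and 10% validation overall.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.2, stratify=y_rest, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 4000 1000 5000
```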
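The configuration in the Experiment Setup row translates naturally into a PyTorch model. The sketch below wires up hidden layers of widths (512, 768, 512) with LeakyReLU slope 0.01 and an optimizer using the stated learning rate 0.01 and momentum 0.99; the input width of 20, the choice of SGD, and the binary cross-entropy loss are assumptions made for illustration and are not confirmed by the excerpt above.

```python
import torch
import torch.nn as nn

# Hedged sketch of a feed-forward binary classifier matching Table 1:
# hidden layers (512, 768, 512), LeakyReLU with negative slope 0.01,
# and a single logit for the terminal / non-terminal decision.
# The input width of 20 (a flattened 2 x 10 weight matrix) is an assumption.
model = nn.Sequential(
    nn.Linear(20, 512), nn.LeakyReLU(0.01),
    nn.Linear(512, 768), nn.LeakyReLU(0.01),
    nn.Linear(768, 512), nn.LeakyReLU(0.01),
    nn.Linear(512, 1),
)

# "Momentum 0.99" with "initial learning rate 0.01" suggests SGD with momentum
# (an assumption); batch size 128 would be set on the DataLoader.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.99)
loss_fn = nn.BCEWithLogitsLoss()

# One illustrative training step on a random batch of 128 examples.
x = torch.randn(128, 20)
y = torch.randint(0, 2, (128, 1)).float()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```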