ParFam -- (Neural Guided) Symbolic Regression via Continuous Global Optimization

Authors: Philipp Scholl, Katharina Bieker, Hillary Hauger, Gitta Kutyniok

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We theoretically analyze the expressivity of ParFam and demonstrate its performance with extensive numerical experiments based on the common SR benchmark suite SRBench, showing that we achieve state-of-the-art results. Moreover, we present an extension incorporating a pre-trained transformer network (DL-ParFam) to guide ParFam, accelerating the optimization process by up to two orders of magnitude. Our code and results can be found at https://github.com/Philipp238/parfam.
Researcher Affiliation Academia Philipp Scholl, Katharina Bieker, Hillary Hauger & Gitta Kutyniok, Mathematical Institute, LMU Munich, Munich, Germany. P. S. and G. K. are also affiliated with the Munich Center for Machine Learning (MCML), Germany. G. K. is further affiliated with the German Aerospace Center (DLR), Institute of Robotics and Mechatronics, and the University of Tromsø, Norway.
Pseudocode Yes Algorithm 1: Traversal of the model parameters. Input: maximal degree of the input numerator d1_max,in, maximal degree of the output numerator d1_max,out, maximal degree of the input denominator d2_max,in, maximal degree of the output denominator d2_max,out, maximal number of base functions b_max, and set of base functions G_max = {g_1, ..., g_k}. Output: list of model parameters L that defines the models ParFam can iterate through.
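The traversal described in Algorithm 1 can be sketched as an enumeration of model configurations ordered from simple to complex. The function and dictionary-key names below are illustrative assumptions, not the paper's actual implementation:

```python
from itertools import combinations, product

def traverse_model_parameters(d1_max_in, d1_max_out, d2_max_in, d2_max_out,
                              b_max, base_functions):
    """Sketch of Algorithm 1: enumerate ParFam model configurations.

    Each configuration pairs a subset of base functions with degrees for the
    numerator/denominator polynomials of the input and output rationals.
    """
    configs = []
    for n_base in range(b_max + 1):
        for funcs in combinations(base_functions, n_base):
            for d1_in, d1_out, d2_in, d2_out in product(
                range(d1_max_in + 1), range(d1_max_out + 1),
                range(d2_max_in + 1), range(d2_max_out + 1),
            ):
                configs.append({
                    "base_functions": funcs,
                    "degree_num_in": d1_in, "degree_num_out": d1_out,
                    "degree_den_in": d2_in, "degree_den_out": d2_out,
                })
    # Try cheaper families (fewer base functions, lower degrees) first.
    configs.sort(key=lambda c: (len(c["base_functions"]),
                                c["degree_num_in"] + c["degree_num_out"]
                                + c["degree_den_in"] + c["degree_den_out"]))
    return configs

# With the paper's degree choices (2/2 for the first layer, 4/3 for the
# output rational) and up to 2 of 3 base functions: 7 subsets x 180 degree
# combinations = 1260 configurations.
models = traverse_model_parameters(2, 4, 2, 3, 2, ("sin", "exp", "sqrt"))
print(len(models))
```

Sorting the list so that smaller families come first matches the paper's strategy of iterating through smaller parametric families contained in the large one.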
Open Source Code Yes Our code and results can be found at https://github.com/Philipp238/parfam.
Open Datasets Yes After the introduction of the SR benchmark (SRBench) by La Cava et al. (2021), several researchers have reported their findings on SRBench's ground-truth and black-box datasets, due to the usage of real-world equations and data, the variety of formulas, the size of the benchmark, and comparability to competitors. The ground-truth datasets are synthetic datasets following real physical formulas, and the black-box datasets are real-world datasets for which the underlying formulas are unknown. These datasets are described in more detail in Appendix F.
Dataset Splits Yes To ensure comparability with the results evaluated on SRBench, we use the same evaluation metrics as La Cava et al. (2021). For the ground-truth problems, we first report the symbolic solution rate, which is the percentage of equations recovered by an algorithm. Second, we consider the coefficient of determination R^2 = 1 - (sum_{i=1}^N (y_i - yhat_i)^2) / (sum_{i=1}^N (y_i - ybar)^2), (5) where yhat_i = f_theta(x_i) represents the model's prediction and ybar the mean of the output data y. The closer R^2 is to 1, the better the model describes the variation in the data. It is a widely used measure for goodness-of-fit since it is independent of the scale and variation of the data. Following La Cava et al. (2021), we report the accuracy solution rate for each algorithm, defined as the percentage of functions for which R^2 > 0.999. The original datasets do not include any noise. However, similar to La Cava et al. (2021), we additionally perform experiments with noise by adding eps_i ~ N(0, sigma^2 * (1/N) sum_{i=1}^N y_i^2) to the targets y_i, where sigma denotes the noise level.
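The R^2 metric and the noise model quoted above can be written out in a few lines; this is a generic sketch following the stated definitions, not SRBench's own implementation:

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def add_noise(y, sigma, rng=None):
    """Add eps_i ~ N(0, sigma^2 * mean(y_i^2)) to the targets y."""
    rng = np.random.default_rng(rng)
    scale = sigma * np.sqrt(np.mean(y ** 2))
    return y + rng.normal(0.0, scale, size=y.shape)

y = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.1, 1.9, 3.2, 3.9])
print(r_squared(y, y_hat))           # close to 1 for a good fit
print(r_squared(y, y_hat) > 0.999)   # the accuracy-solution criterion
```

Note that the noise variance is scaled by the mean squared target, so a given noise level sigma perturbs all datasets by a comparable relative amount.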
Hardware Specification Yes The creation of the data took 1.3h and the training took 93h on one RTX 3090. We further fine-tuned the network on 4,000,000 noisy functions for 30h on one RTX 3090. For comparison, end-to-end (E2E) (Kamienny et al., 2022) made use of 800 GPUh, and NeSymRes (Biggio et al., 2021), which is only trained for problems up to dimension 3, used 10,000,000 functions and was trained on a GeForce RTX 2080 GPU for 72h.
Software Dependencies No Our experiments use the PySR package version 0.18.1 (Apache License 2.0). Our experiments use the official DSO GitHub repository (BSD 3-Clause License). Our experiments use the official GitHub repository (Apache 2.0 license). Our experiments use the official GitHub repository (MIT license). Our experiments use the code from the official EQL GitHub repository (GNU General Public License v3.0).
Experiment Setup Yes The first subset defines the parametric family (f_theta)_{theta in R^m}, e.g., the degree of the polynomials and the set of base functions. A good choice for this set is highly problem-dependent. However, in the absence of prior knowledge, it is advantageous to select a parametric family that is sufficiently expansive to encompass a wide range of potential functions. In this context, we opt for sin, exp, and sqrt as our base functions. For the first-layer rationals Q_1, . . . , Q_k, we set the degrees of the numerator and denominator polynomials to 2. For Q_{k+1}, we set the degree of the numerator polynomial to 4 and the denominator polynomial to 3. This choice results in a parametric family with hundreds of parameters, making it challenging for global optimization. To address this issue, ParFam iterates through various smaller parametric families, each contained in this larger family; see Appendix H for details. The second set of hyperparameters defines the optimization scheme. Here, we set the regularization parameter to lambda = 0.001, the number of iterations for basin-hopping to 10, and the maximal number of BFGS steps for the local search to 100 times the dimension of the problem. Our choices of parameters for ParFam and DL-ParFam are summarized in Table 7 in Appendix I. The hyperparameters for the pre-training of DL-ParFam can be found in Appendix D.
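The described optimization scheme (basin-hopping for global search, BFGS for local refinement, L1 regularization with lambda = 0.001) can be sketched with SciPy. The toy model below is an illustrative stand-in for ParFam's rational-function family, not the paper's actual parametrization:

```python
import numpy as np
from scipy.optimize import basinhopping

# Toy data: a target that the toy model class can represent exactly.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = np.sin(2.0 * x) + 0.5 * x

def model(theta, x):
    # Illustrative stand-in for ParFam's parametric family (the real family
    # is built from rational functions composed with base functions).
    a, b, c = theta
    return a * np.sin(b * x) + c * x

lam = 1e-3  # regularization parameter lambda, as set in the paper

def loss(theta):
    # Mean squared error plus an L1 penalty encouraging sparse coefficients.
    residual = model(theta, x) - y
    return np.mean(residual ** 2) + lam * np.sum(np.abs(theta))

dim = 3
result = basinhopping(
    loss,
    x0=np.ones(dim),
    niter=10,  # basin-hopping iterations, matching the paper's setting
    minimizer_kwargs={"method": "BFGS",
                      "options": {"maxiter": 100 * dim}},  # 100 x dimension
)
print(result.x, result.fun)
```

Basin-hopping repeatedly perturbs the current parameters and runs a local BFGS search from each perturbed point, which helps escape the local minima that a single gradient-based run would get stuck in.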