Generative Feature Training of Thin 2-Layer Networks
Authors: Johannes Hertrich, Sebastian Neumayer
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our approach by numerical examples. First, we visually inspect the obtained features. Here, we also check if they recover the correct subspaces. Secondly, we benchmark our methods on common test functions from approximation theory, i.e., with a known ground truth. Lastly, we target regression on some datasets from the UCI database (Kelly et al., 2023). |
| Researcher Affiliation | Academia | Johannes Hertrich, Université Paris Dauphine-PSL; Sebastian Neumayer, Technische Universität Chemnitz |
| Pseudocode | Yes | Algorithm 1: GFT and GFT-r training procedures. 1: Given: data (x_k, y_k)_{k=1}^M, architecture f_{w,b} as in (1), generator G_θ, latent distribution η. 2: while training G_θ do 3: sample N latents z_l ∼ η and set w = G_θ(z) 4: compute optimal b(w) and f_{w,b(w)} based on (6) 5: compute ∇_θ L(θ) or ∇_θ L_reg(θ) with automatic differentiation 6: perform Adam update for θ. 7: if GFT-r then 8: while refining w do 9: set w = G_θ(z) 10: compute optimal b(w) and f_{w,b(w)} based on (6) 11: compute ∇_z F(z) or ∇_z F_reg(z) with automatic differentiation 12: perform Adam update for z. 13: Output: features w and optimal weights b(w) |
| Open Source Code | Yes | Our PyTorch implementation is available online. We run all experiments on an NVIDIA RTX 4090 GPU. [...] The PyTorch implementation corresponding to our experiments is available at https://github.com/johertrich/generative_feature_training. |
| Open Datasets | Yes | Next, we apply our method for regression on several UCI datasets Kelly et al. (2023). For this, we do not have an underlying ground truth function f. Here, we compare our methods with standard gradient-based neural network training, SHRIMP and SALSA. |
| Dataset Splits | Yes | To pick the regularization strength λ, we divide the original training data into a training (90%) and a validation (10%) set. |
| Hardware Specification | Yes | Our PyTorch implementation is available online. We run all experiments on an NVIDIA RTX 4090 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch implementation' and 'Adam optimizer' but does not specify version numbers for these software components. For example, it states: 'This procedure is implemented in many automatic differentiation packages such as PyTorch', and 'We optimize the loss functions for GFT and for the feature refinement with the Adam optimizer'. |
| Experiment Setup | Yes | We optimize the loss functions for GFT and for the feature refinement with the Adam optimizer using a learning rate of 1·10⁻⁴ for 40000 steps. The regularization ε for solving the least squares problem (6) is set to ε = 1·10⁻⁷. For the neural network optimization, we use the Adam optimizer with a learning rate of 1·10⁻³ for 100000 steps. In all cases, we discretize the spatial integral for the regularization term in (10) by 1000 samples. For the kernel ridge regression, we use a Gauss kernel with its parameter chosen by the median rule. That is, we set it to the median distance of two points in the dataset. [...] Further, we choose the generator G_θ for the proposal distribution p_w = G_θ#N(0, I_d) as a ReLU network with 3 hidden layers and 512 neurons per hidden layer. |
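The inner solve in Algorithm 1 (computing the optimal outer weights b(w) from a regularized least-squares problem with ε = 1·10⁻⁷) can be sketched in plain NumPy. The exact form of equation (6) is not reproduced in the table, so the ReLU feature map and the ridge-style regularization below are assumptions, not the paper's verbatim formulation:

```python
import numpy as np

def optimal_outer_weights(W, X, y, eps=1e-7):
    """Sketch of the inner least-squares solve: for fixed features W
    (N x d), find outer weights b minimizing ||A b - y||^2 + eps ||b||^2,
    where A[k, n] = relu(<w_n, x_k>). The ReLU feature map is an
    assumption standing in for the paper's architecture (1)."""
    A = np.maximum(X @ W.T, 0.0)  # (M, N) feature matrix
    # Ridge-regularized normal equations: (A^T A + eps I) b = A^T y
    return np.linalg.solve(A.T @ A + eps * np.eye(W.shape[0]), A.T @ y)

# Usage on toy data with random features
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))   # 50 samples in R^3
W = rng.normal(size=(10, 3))   # 10 candidate features
y = rng.normal(size=50)
b = optimal_outer_weights(W, X, y)
print(b.shape)  # (10,)
```

Because this solve has a closed form, gradients with respect to the features (and hence the generator parameters θ) can flow through it via automatic differentiation, which is what steps 4–5 of Algorithm 1 rely on.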
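The median rule quoted for the kernel ridge regression baseline (set the Gauss kernel parameter to the median distance of two points in the dataset) is a one-liner to implement. This sketch assumes Euclidean distances and that the median is taken over distinct pairs:

```python
import numpy as np

def median_bandwidth(X):
    """Median rule: return the median Euclidean distance between two
    distinct points in the dataset X (n x d)."""
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))  # (n, n) pairwise distances
    iu = np.triu_indices(len(X), k=1)  # distinct pairs only, skip diagonal
    return np.median(dists[iu])

X = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
print(median_bandwidth(X))  # pairwise distances 5, 4, 3 -> median 4.0
```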
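The dataset-splits row states that λ is picked by splitting the original training data 90/10 into training and validation parts. A generic sketch of that selection loop follows; the candidate grid and the linear ridge model are illustrative stand-ins, since the paper's actual regularized objective is not reproduced here:

```python
import numpy as np

def pick_lambda(X, y, lambdas, seed=0):
    """Hold out 10% of the training data as a validation set, fit a
    model per candidate lambda on the remaining 90%, and return the
    lambda with the lowest validation error. The linear ridge fit is a
    stand-in for the paper's regularized training procedure."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))
    cut = int(0.9 * len(X))
    tr, va = perm[:cut], perm[cut:]
    best_lam, best_err = None, np.inf
    for lam in lambdas:
        w = np.linalg.solve(X[tr].T @ X[tr] + lam * np.eye(X.shape[1]),
                            X[tr].T @ y[tr])
        err = np.mean((X[va] @ w - y[va]) ** 2)
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)
print(pick_lambda(X, y, [1e-3, 1e-1, 10.0]))
```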