Learning Hierarchical Polynomials of Multiple Nonlinear Features

Authors: Hengyu Fu, Zihao Wang, Eshaan Nichani, Jason Lee

ICLR 2025

Reproducibility variables, each with the assessed result and the LLM response quoted as evidence:
Research Type: Experimental
Evidence: "A. Numerical Experiments: We empirically verify Theorem 1 and Proposition 1. [...] The left panel of Figure 2 demonstrates that our model outperforms the naive random-feature model across all dimensions."
Researcher Affiliation: Academia
Evidence: Peking University, Stanford University, Princeton University (author email addresses redacted).
Pseudocode: Yes
Evidence: "Algorithm 1: Layer-wise training algorithm"
Open Source Code: No
Evidence: No explicit statement about code availability or a repository link was found in the paper.
Open Datasets: No
Evidence: "Data distribution: Our aim is to learn the target function f : X → R, with X ⊆ R^d being the input space. Throughout the paper, we assume X = S^{d−1}(√d), that is, the sphere of radius √d in d dimensions. Also, we consider the data distribution to be the uniform distribution on the sphere, i.e., x ∼ Unif(X), and we draw two independent datasets D1, D2, each with n1 and n2 i.i.d. samples, respectively."
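The sampling scheme quoted above (uniform on the sphere of radius √d, with two independent splits) can be sketched as follows; the function name `sample_sphere` and the split sizes are illustrative, not from the paper:

```python
import numpy as np

def sample_sphere(n, d, rng):
    """Draw n points uniformly from S^{d-1}(sqrt(d)), the sphere of
    radius sqrt(d) in R^d, by normalizing standard Gaussian vectors."""
    g = rng.standard_normal((n, d))
    return np.sqrt(d) * g / np.linalg.norm(g, axis=1, keepdims=True)

rng = np.random.default_rng(0)
d = 32
X1 = sample_sphere(1000, d, rng)  # inputs for split D1 (sizes are toy values)
X2 = sample_sphere(500, d, rng)   # inputs for split D2
```

Normalizing Gaussian vectors gives exactly the uniform distribution on the sphere, since the Gaussian density is rotationally invariant.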
Dataset Splits: Yes
Evidence: "Data distribution: [...] we draw two independent datasets D1, D2, each with n1 and n2 i.i.d. samples, respectively. Thus, we draw n1 + n2 samples in total. [...] Training Algorithm: Following Nichani et al. (2023), our network is trained via layer-wise gradient descent with sample splitting. [...] Algorithm 1: Layer-wise training algorithm. Input: learning rates η1, η2, weight decay λ1, λ2, parameter ϵ, number of steps T [...] (line 2) train W on dataset D1 [...] (line 8) train a on dataset D2"
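The sample-splitting structure of Algorithm 1 (gradient descent on the hidden weights W using D1 only, then fitting the outer weights a on D2) can be sketched as below. The two-layer ReLU network, the ridge-regression second stage, and all names and hyperparameter values are simplifying assumptions for illustration, not the paper's exact architecture:

```python
import numpy as np

def layerwise_train(X1, y1, X2, y2, m, T, eta1, lam1, lam2, rng):
    # Stage 1: T gradient steps on hidden weights W, using split D1 only.
    d = X1.shape[1]
    W = rng.standard_normal((m, d)) / np.sqrt(d)
    a = rng.choice([-1.0, 1.0], size=m) / m    # outer weights, frozen in stage 1
    for _ in range(T):
        H = np.maximum(X1 @ W.T, 0.0)          # (n1, m) ReLU features
        resid = H @ a - y1                     # prediction residual, (n1,)
        grad_H = np.outer(resid, a) * (H > 0)  # back-prop through the ReLU
        W -= eta1 * (grad_H.T @ X1 / len(y1) + lam1 * W)
    # Stage 2: fit the outer weights a by ridge regression on split D2.
    H2 = np.maximum(X2 @ W.T, 0.0)
    a = np.linalg.solve(H2.T @ H2 + lam2 * np.eye(m), H2.T @ y2)
    return W, a

rng = np.random.default_rng(0)
X1, X2 = rng.standard_normal((100, 8)), rng.standard_normal((80, 8))
y1, y2 = X1[:, 0], X2[:, 0]                    # toy linear target
W, a = layerwise_train(X1, y1, X2, y2, m=20, T=5,
                       eta1=0.1, lam1=1e-3, lam2=1e-2, rng=rng)
```

Because a is re-fit on the independent split D2, the learned features H2 are statistically independent of the stage-2 labels, which is what makes the sample-splitting analysis tractable.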
Hardware Specification: No
Evidence: No specific hardware details (such as exact GPU/CPU models, processor types, or memory amounts) were found in the paper's experimental setup or any other section.
Software Dependencies: No
Evidence: No specific software dependencies with version numbers (e.g., library or solver names with versions) were found in the paper.
Experiment Setup: Yes
Evidence: "Input: learning rates η1, η2, weight decay λ1, λ2, parameter ϵ, number of steps T [...] We initialize each row of V to be drawn uniformly on the sphere of radius √d [...] For the network architecture, we choose σ1 as per (2) and σ2 = Q2, with network sizes set to m1 = 10000 and m2 = 20000. [...] For the right panel, we conduct transfer learning with n1 = 2^16 pretraining samples and plot the dependence on n2. The figure reports the mean and normalized standard error of the test error using 10,000 fresh samples, based on 5 independent experimental instances."