Can Kernel Methods Explain How the Data Affects Neural Collapse?

Authors: Vignesh Kothapalli, Tom Tirer

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present empirical results on Gaussian data to show that the adaptivity of such kernels yields lower NC1 and allows them to reasonably approximate the NC1 behavior of shallow NNs for linearly separable datasets. We conduct experiments on datasets with varying sample sizes and input dimensions to verify our theoretical results and show that the insights generalize (e.g., beyond d0 = 1).
Researcher Affiliation | Academia | Vignesh Kothapalli (EMAIL), Courant Institute of Mathematical Sciences, New York University; Tom Tirer (EMAIL), Faculty of Engineering, Bar-Ilan University.
Pseudocode | No | The paper describes methods and equations, such as the Equations of State (EoS) for the data-aware GP kernel in Definition 6.1 and their numerical solutions in Section 6.2, but it does not present these as structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at: https://github.com/kvignesh1420/shallow_nc1.
Open Datasets | Yes | For C = 2, a dataset size N chosen from {128, 256, 512, 1024}, and input dimension d0 chosen from {1, 2, 8, 32, 128}, we create the data vector and label pairs as follows: D1(N, d0) = {(x_{1,i} ~ N(−2·1_{d0}, 0.25·I_{d0}), y_{1,i} = −1) : i ∈ [N/2]} ∪ {(x_{2,j} ~ N(2·1_{d0}, 0.25·I_{d0}), y_{2,j} = 1) : j ∈ [N/2]}.
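The dataset construction quoted above can be sketched in a few lines of NumPy: each class draws N/2 samples from an isotropic Gaussian with mean ±2·1_{d0} and variance 0.25 (i.e., standard deviation 0.5), labeled ∓1. The function name and seed below are illustrative, not from the paper's code.

```python
import numpy as np

def make_d1(N, d0, seed=0):
    """Sample the two-class Gaussian dataset D1(N, d0) described above."""
    rng = np.random.default_rng(seed)
    half = N // 2
    # Class 1: mean -2 in every coordinate, variance 0.25 => std 0.5, label -1.
    x1 = rng.normal(loc=-2.0, scale=0.5, size=(half, d0))
    # Class 2: mean +2 in every coordinate, label +1.
    x2 = rng.normal(loc=2.0, scale=0.5, size=(half, d0))
    X = np.concatenate([x1, x2], axis=0)
    y = np.concatenate([-np.ones(half), np.ones(half)])
    return X, y

X, y = make_d1(N=128, d0=2)
```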
Dataset Splits | No | The paper describes the generation of synthetic datasets D1(N, d0) and D2(N, d0) with parameters for the class distributions and sample sizes, e.g., D1(N, d0) = {(x_{1,i} ~ N(−2·1_{d0}, 0.25·I_{d0}), y_{1,i} = −1) : i ∈ [N/2]} ∪ {(x_{2,j} ~ N(2·1_{d0}, 0.25·I_{d0}), y_{2,j} = 1) : j ∈ [N/2]}. However, it does not explicitly state how these datasets are split into training, validation, or test sets for the experiments involving the 2L-FCN.
Hardware Specification | Yes | All the experiments in this paper were executed on a machine with 16 GB of host memory and 8 CPU cores.
Software Dependencies | No | The paper mentions using the scipy.optimize.newton_krylov Python API for solving the EoS, but it does not provide specific version numbers for Python, SciPy, or any other software libraries used in the experiments.
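For context, scipy.optimize.newton_krylov finds a root F(x) = 0 of a nonlinear system using a Jacobian-free Newton–Krylov iteration. The residual below is a toy stand-in chosen only to show the API shape; it is not the paper's actual Equations of State.

```python
import numpy as np
from scipy.optimize import newton_krylov

def residual(q):
    # Toy nonlinear system: find q such that q = tanh(2q) + 0.5,
    # written as a root-finding problem residual(q) = 0.
    return q - np.tanh(2.0 * q) - 0.5

q0 = np.ones(3)                                   # initial guess
q_star = newton_krylov(residual, q0, f_tol=1e-10)  # Jacobian-free Newton-Krylov solve
```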
Experiment Setup | Yes | Setup: We train a 2L-FCN with d1 = 500, σw = 1, σb = 0 and Erf activation using (vanilla) Gradient Descent with a learning rate of 10^-3 and weight decay 10^-6 for 1000 steps to reach the terminal phase of training... The ReLU activation experiments use a learning rate of 10^-4.
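The stated setup can be sketched as a minimal NumPy training loop: a two-layer fully connected network with hidden width d1 = 500, Erf activation, vanilla gradient descent with learning rate 10^-3 and weight decay 10^-6 for 1000 steps. The MSE loss, the NTK-style 1/sqrt(fan_in) initialization scaling, and the small toy dataset (N = 64, d0 = 2) are illustrative assumptions, not details taken from the paper.

```python
import math
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(0)

# Toy stand-in for the paper's Gaussian data: class means at -2 and +2,
# variance 0.25; N and d0 here are illustrative choices.
N, d0, d1 = 64, 2, 500
X = np.concatenate([rng.normal(-2.0, 0.5, (N // 2, d0)),
                    rng.normal(2.0, 0.5, (N // 2, d0))])
y = np.concatenate([-np.ones(N // 2), np.ones(N // 2)])

# sigma_w = 1, sigma_b = 0, with NTK-style 1/sqrt(fan_in) scaling (an assumption).
W1 = rng.normal(0.0, 1.0 / math.sqrt(d0), (d0, d1))
w2 = rng.normal(0.0, 1.0 / math.sqrt(d1), d1)

lr, wd, steps = 1e-3, 1e-6, 1000
losses = []
for _ in range(steps):
    h = X @ W1                                   # pre-activations, shape (N, d1)
    a = erf(h)                                   # Erf activation
    pred = a @ w2                                # network output, shape (N,)
    err = pred - y
    losses.append(float(np.mean(err ** 2)))      # MSE loss (an assumption)
    # Gradients of (1/2) * mean squared error, plus weight decay.
    grad_w2 = a.T @ err / N + wd * w2
    da = np.outer(err, w2) * (2.0 / math.sqrt(math.pi)) * np.exp(-h ** 2)  # erf'(h)
    grad_W1 = X.T @ da / N + wd * W1
    w2 -= lr * grad_w2                           # vanilla gradient descent step
    W1 -= lr * grad_W1
```

On this linearly separable toy data the training loss drops quickly, which matches the paper's description of reaching the terminal phase of training within 1000 steps.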