A Theoretical Framework For Overfitting In Energy-based Modeling
Authors: Giovanni Catania, Aurélien Decelle, Cyril Furtlehner, Beatriz Seoane
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This work develops a theoretical framework for understanding and mitigating overfitting in EBMs. We begin with a simple Gaussian model as a fundamental non-trivial example, using it to quantitatively analyze overfitting through synthetic experiments with predefined ground truths. We examine eigenvalue dynamics using artificial covariance matrices that simulate real datasets, exploring how overfitting arises from different learning timescales associated with various eigenmodes of the empirical covariance matrix. We address inaccuracies in learned eigenvalues with corrections based on random matrix theory (RMT)... |
| Researcher Affiliation | Academia | 1Departamento de Física Teórica, Universidad Complutense de Madrid, Spain. 2Escuela Técnica Superior de Ingenieros Industriales, Universidad Politécnica de Madrid, Spain. 3Inria-Saclay, Université Paris-Saclay, LISN, Gif-sur-Yvette, France. |
| Pseudocode | No | The paper contains mathematical equations and descriptions of methods, but no explicit sections or figures labeled as "Pseudocode" or "Algorithm" with structured, code-like steps. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing code, nor does it provide links to code repositories or mention code in supplementary materials. |
| Open Datasets | Yes | Figure 1. (a): Eigenvalue spectra of the empirical covariance matrices for MNIST dataset (Deng, 2012)... We illustrate how the principal components control a timescale separation, where information progressively encoded from the strongest to the weakest data modes... The spectra for CIFAR-10 (Krizhevsky et al., 2009) and the Human Genome Dataset (Consortium et al., 2015) are displayed in (a) and (b), respectively. |
| Dataset Splits | No | The paper discusses generating data points with finite 'M' samples for training and evaluates 'Etrain' and 'Etest' or 'LLtrain,test', implying a distinction between training and testing data. However, it does not provide specific details on how empirical datasets were split into training, validation, or test sets with percentages, absolute counts, or predefined split methodologies, which is required for reproducing data partitioning. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory amounts, or cloud instances) used for running its experiments. |
| Software Dependencies | No | The paper describes mathematical and algorithmic methodologies but does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their specific versions) used for implementation or experimentation. |
| Experiment Setup | Yes | J_{ij}^{t+1} = J_{ij}^{t} + γ ∂L/∂J_{ij}, where γ is the learning rate... In all cases the initial condition is an identity matrix. (Figure 2)... The learning rate is set to γ = 10^-3. (Appendix C)... The learning rate is set to γ = 10^-2. (Appendix I)... starting from the same initial condition (J_α(0) = 1)... Starting from an initial condition J(0) that does not commute with C... |
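The gradient-ascent setup quoted above can be sketched for the paper's Gaussian case. A minimal illustration, with assumptions not taken from the paper: the log-likelihood gradient ∂L/∂J ∝ J⁻¹ − C is the standard Gaussian energy-based-model result (any constant prefactor is absorbed into γ), and the toy dimensions, sample count, and random seed below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: a well-conditioned ground-truth precision matrix
# and a finite sample of M training points (finite M is what drives
# overfitting in the paper's analysis).
d, M = 5, 500
B = 0.1 * rng.normal(size=(d, d))
J_true = np.eye(d) + (B + B.T) / 2                 # symmetric positive definite
x = rng.multivariate_normal(np.zeros(d), np.linalg.inv(J_true), size=M)
C = x.T @ x / M                                    # empirical covariance

# Gradient ascent on the Gaussian log-likelihood,
#   L(J) = (1/2) log det J - (1/2) Tr(J C),  so  dL/dJ ∝ J^{-1} - C,
# starting from the identity initial condition, as in the paper.
gamma = 0.05
J = np.eye(d)
for _ in range(5000):
    J += gamma * (np.linalg.inv(J) - C)

# The fixed point of the dynamics is the inverse *empirical* covariance:
# the model fits C exactly, including its finite-M sampling noise.
err = np.max(np.abs(J - np.linalg.inv(C)))
```

The fixed point J = C⁻¹ makes the overfitting mechanism concrete: the learned couplings reproduce the empirical covariance of the M samples, not the true one, and (as the paper analyzes) each eigenmode of C relaxes toward this fixed point on its own timescale.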