Finite-sample Analysis of Interpolating Linear Classifiers in the Overparameterized Regime
Authors: Niladri S. Chatterji, Philip M. Long
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Simulations We experimentally study the behavior of the maximum margin classifier in the overparameterized regime on synthetic data generated according to the Boolean noisy rare-weak model. Recall that this is a model where the clean label ỹ ∈ {−1, 1} is first generated by flipping a fair coin. Then the covariate x is drawn from a distribution conditioned on ỹ such that s of the coordinates of x are equal to ỹ with probability 1/2 + γ and the other p − s coordinates are random and independent of the true label. The noisy label y is obtained by flipping ỹ with probability η. In this section the flipping probability η is always 0.1. In all our experiments the number of samples n is kept constant at 100. In the first experiment in Figure 2 we hold n and the number of relevant attributes s constant and vary the dimension p for different values of γ. We find that after an initial dip in the test error (for γ = 0.2, 0.3) the test error starts to rise slowly with p, as in our upper bounds. Next, in Figure 3 we explore the scaling of the test error with the number of relevant attributes s when n and p are held constant. As we would expect, the test error decreases as s grows for all the different values of γ. Finally, in Figure 4 we study how the test error changes when both p and s are increasing when n and γ are held constant. Our results (see Corollary 3) do not guarantee learning when s = Θ(√p) (and Jin (2009) proved that learning is impossible in a related setting, even in the absence of noise); we find that the test error remains constant in our experiment in this setting. In the cases when s = p^0.55 and when s = p^0.65, slightly beyond this threshold, the test error approaches the Bayes-optimal error as p gets large in our experiment. |
| Researcher Affiliation | Collaboration | Niladri S. Chatterji EMAIL University of California, Berkeley, 366 Le Conte Hall, Berkeley, CA 94720 Philip M. Long EMAIL Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043 |
| Pseudocode | No | The paper includes mathematical proofs, lemmas, and theoretical derivations (e.g., Lemma 4, Lemma 6, Lemma 7, Lemma 8, Lemma 11, Lemma 22), but it does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states that "a preliminary version of this paper was posted on arXiv (Chatterji and Long, 2020)" which is a preprint server, but it does not provide an explicit statement about code release for the methodology described in the paper, nor does it include any links to code repositories. |
| Open Datasets | No | The paper uses "synthetic data generated according to the Boolean noisy rare-weak model" for its simulations. It describes the data generation process in detail but does not provide access information (link, DOI, repository) to an existing public dataset or to the specific synthetic data generated for the experiments. Therefore, it does not provide concrete access information for a publicly available or open dataset. |
| Dataset Splits | No | In the "Simulations" section, the paper mentions that "The number of samples n = 100 is kept fixed" and that "The plot is generated by averaging over 500 draws of the samples. The train error on all runs was always 0." While it mentions 'test error' and 'train error,' it does not specify how the initial n=100 samples are split into training, validation, or test sets, or any specific methodology for partitioning the data, beyond implying that the 'test error' is evaluated on separate data from the training process. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the simulations or experiments described in Section 5. |
| Software Dependencies | No | The paper does not specify any software dependencies, libraries, or programming languages with version numbers that would be needed to reproduce the experiments. |
| Experiment Setup | No | The "Simulations" section describes the parameters used for generating the synthetic data (e.g., "flipping probability η is always 0.1", "number of samples n = 100", "number of relevant variables s = 50", varying "dimension p in the interval [100, 3000]", different values of "γ"). However, it does not provide details about the experimental setup for training the maximum margin classifier itself, such as specific hyperparameters (e.g., learning rate, number of iterations) for the gradient descent process mentioned in the theoretical analysis. |
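The Boolean noisy rare-weak model quoted in the Research Type row above is fully specified by (n, p, s, γ, η), so the data-generation step is straightforward to sketch. The following NumPy snippet is our own illustrative reconstruction from the paper's description, not the authors' code; the function and variable names are invented for this sketch:

```python
import numpy as np

def sample_rare_weak(n, p, s, gamma, eta, rng):
    """Draw n samples from the Boolean noisy rare-weak model as described
    in Section 5 of the paper (illustrative sketch, not the authors' code)."""
    # Clean labels: fair coin flips in {-1, +1}.
    y_clean = rng.choice([-1, 1], size=n)
    # Relevant coordinates (taken to be the first s, WLOG): each equals the
    # clean label with probability 1/2 + gamma, otherwise its negation.
    agree = rng.random((n, s)) < 0.5 + gamma
    X_rel = np.where(agree, y_clean[:, None], -y_clean[:, None])
    # The remaining p - s coordinates are uniform random signs,
    # independent of the true label.
    X_irr = rng.choice([-1, 1], size=(n, p - s))
    X = np.hstack([X_rel, X_irr])
    # Observed (noisy) labels: flip the clean label with probability eta.
    flip = rng.random(n) < eta
    y = np.where(flip, -y_clean, y_clean)
    return X, y, y_clean

# Parameters from the Simulations section: n = 100, eta = 0.1,
# s = 50 relevant attributes, dimension p in [100, 3000].
rng = np.random.default_rng(0)
X, y, y_clean = sample_rare_weak(n=100, p=1000, s=50, gamma=0.3, eta=0.1, rng=rng)
```

A maximum margin classifier would then be fit on (X, y); the paper's observation that "the train error on all runs was always 0" is consistent with the data being linearly separable in this overparameterized regime (p ≫ n).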