Learning Sums of Independent Random Variables with Sparse Collective Support

Authors: Anindya De, Philip M. Long, Rocco A. Servedio

JMLR 2020

Reproducibility assessment (variable, result, and LLM response):
Research Type: Theoretical
We study the learnability of sums of independent integer random variables given a bound on the size of the union of their supports... We give two main algorithmic results for learning such distributions... We prove an essentially matching lower bound... Our algorithms and lower bounds together settle the question of how the sample complexity of learning sums of independent integer random variables scales with the elements in the union of their supports... Keywords: Central limit theorem, sample complexity, sums of independent random variables, equidistribution, semi-agnostic learning.
Researcher Affiliation: Collaboration
Anindya De (email), Department of Computer and Information Science, University of Pennsylvania... Philip M. Long (email), Google... Rocco A. Servedio (email), Department of Computer Science, Columbia University...
Pseudocode: No
The paper describes algorithms conceptually in text, for example in Section 8, titled "The learning result: Learning when |A| ≤ 4", where it details steps such as "First main step of the algorithm", "Second main step", and "Third main step". However, it does not present these algorithms in structured pseudocode blocks or in figures explicitly labeled as "Pseudocode" or "Algorithm".
Open Source Code: No
The paper states: "©2020 Anindya De, Philip M. Long and Rocco A. Servedio. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v21/18-531.html." This refers to the license of the paper itself and does not provide any information about the availability of source code for the described methodology.
Open Datasets: No
The paper focuses on the theoretical learnability of sums of independent random variables and describes abstract distributions, mentioning that "training data is generated from a distribution that is only ε-close to some A-sum". There is no mention of real-world or publicly available datasets used for empirical validation, nor are there any links, DOIs, or citations to data repositories.
Dataset Splits: No
The paper is theoretical and does not involve empirical experiments on real-world datasets. Therefore, there is no discussion or specification of training, test, or validation dataset splits.
Hardware Specification: No
The paper is theoretical, analyzing algorithmic results and lower bounds for learning distributions in terms of computational complexity (e.g., "poly(1/ε) time"). It does not describe any empirical experiments or mention specific hardware components such as GPUs, CPUs, or TPUs used for computations.
Software Dependencies: No
The paper is theoretical and focuses on mathematical concepts, algorithms, and proofs, without describing any practical implementation details. Consequently, it does not list any software dependencies, libraries, or version numbers.
Experiment Setup: No
The paper is theoretical and does not present empirical experiments. Therefore, it provides no details on experimental setup, such as hyperparameter values, model initialization, optimization settings, or training schedules.
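To make the paper's object of study concrete, the sketch below simulates an A-sum (a sum of independent integer random variables whose supports lie in a common small set A) and learns its distribution naively from samples via the empirical histogram. This is not code from the paper; the support A, the number of summands n, and the sample size m are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# An "A-sum" is S = X_1 + ... + X_n, where the X_i are independent integer
# random variables whose supports all lie in a common small set A.
A = np.array([0, 1, 4])   # collective support, |A| = 3 (illustrative choice)
n = 20                    # number of summands (illustrative choice)

# Draw a random distribution over A for each summand.
probs = rng.dirichlet(np.ones(len(A)), size=n)

# Exact pmf of the sum S, computed by convolving the summands' pmfs.
pmf = np.array([1.0])
for p in probs:
    step = np.zeros(A.max() + 1)
    step[A] = p
    pmf = np.convolve(pmf, step)

# Draw m i.i.d. samples of S and form the empirical pmf.
m = 20000
samples = np.zeros(m, dtype=int)
for p in probs:
    samples += A[rng.choice(len(A), size=m, p=p)]
emp = np.bincount(samples, minlength=len(pmf)) / m

# Total variation distance between the learned (empirical) and true pmfs.
tv = 0.5 * np.abs(emp - pmf).sum()
print(f"TV distance with m={m} samples: {tv:.4f}")
```

The histogram's sample complexity grows with the size of the sum's support; the paper's point is that when |A| is small, far fewer samples suffice. The snippet only illustrates the learning problem, not the paper's algorithms.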