Learning Sums of Independent Random Variables with Sparse Collective Support

Authors: Anindya De, Philip M. Long, Rocco A. Servedio

JMLR 2020

Reproducibility assessment (variable, result, and LLM response):
Research Type: Theoretical
We study the learnability of sums of independent integer random variables given a bound on the size of the union of their supports... We give two main algorithmic results for learning such distributions... We prove an essentially matching lower bound... Our algorithms and lower bounds together settle the question of how the sample complexity of learning sums of independent integer random variables scales with the elements in the union of their supports... Keywords: Central limit theorem, sample complexity, sums of independent random variables, equidistribution, semi-agnostic learning.
Researcher Affiliation: Collaboration
Anindya De (email), Department of Computer and Information Science, University of Pennsylvania... Philip M. Long (email), Google... Rocco A. Servedio (email), Department of Computer Science, Columbia University...
Pseudocode: No
The paper describes algorithms conceptually in text, for example in Section 8, titled "The learning result: Learning when |A| ≤ 4", where it details steps such as "First main step of the algorithm", "Second main step", and "Third main step". However, it does not present these algorithms in structured pseudocode blocks or in figures explicitly labeled as "Pseudocode" or "Algorithm".
Open Source Code: No
The paper states: "©2020 Anindya De, Philip M. Long and Rocco A. Servedio. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v21/18-531.html." This refers to the license of the paper itself and does not provide any information about the availability of source code for the described methodology.
Open Datasets: No
The paper focuses on the theoretical learnability of sums of independent random variables and describes abstract distributions, mentioning that "training data is generated from a distribution that is only ε-close to some A-sum". There is no mention of real-world or publicly available datasets used for empirical validation, nor are there any links, DOIs, or citations to data repositories.
Dataset Splits: No
The paper is theoretical and does not involve empirical experiments on real-world datasets. Therefore, there is no discussion or specification of training, test, or validation dataset splits.
Hardware Specification: No
The paper is theoretical, analyzing algorithmic results and lower bounds for learning distributions in terms of computational complexity (e.g., "poly(1/ε) time"). It does not describe any empirical experiments or mention specific hardware components such as GPUs, CPUs, or TPUs used for computations.
Software Dependencies: No
The paper is theoretical and focuses on mathematical concepts, algorithms, and proofs, without describing any practical implementation details. Consequently, it does not list any software dependencies, libraries, or version numbers.
Experiment Setup: No
The paper is theoretical and does not present empirical experiments. Therefore, it provides no details on experimental setup, such as hyperparameter values, model initialization, optimization settings, or training schedules.
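To make the paper's object of study concrete, the sketch below simulates an A-sum (a sum of independent integer random variables whose supports lie in a common small set A) and learns its distribution naively from samples via the empirical histogram. This is not code from the paper; the support A, the number of summands n, and the sample size m are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# An "A-sum" is S = X_1 + ... + X_n, where the X_i are independent integer
# random variables whose supports all lie in a common small set A.
A = np.array([0, 1, 4])   # collective support, |A| = 3 (illustrative choice)
n = 20                    # number of summands (illustrative choice)

# Draw a random distribution over A for each summand.
probs = rng.dirichlet(np.ones(len(A)), size=n)

# Exact pmf of the sum S, computed by convolving the summands' pmfs.
pmf = np.array([1.0])
for p in probs:
    step = np.zeros(A.max() + 1)
    step[A] = p
    pmf = np.convolve(pmf, step)

# Draw m i.i.d. samples of S and form the empirical pmf.
m = 20000
samples = np.zeros(m, dtype=int)
for p in probs:
    samples += A[rng.choice(len(A), size=m, p=p)]
emp = np.bincount(samples, minlength=len(pmf)) / m

# Total variation distance between the learned (empirical) and true pmfs.
tv = 0.5 * np.abs(emp - pmf).sum()
print(f"TV distance with m={m} samples: {tv:.4f}")
```

The histogram's sample complexity grows with the size of the sum's support; the paper's point is that when |A| is small, far fewer samples suffice. The snippet only illustrates the learning problem, not the paper's algorithms.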