Learning Sums of Independent Random Variables with Sparse Collective Support
Authors: Anindya De, Philip M. Long, Rocco A. Servedio
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the learnability of sums of independent integer random variables given a bound on the size of the union of their supports... We give two main algorithmic results for learning such distributions... We prove an essentially matching lower bound... Our algorithms and lower bounds together settle the question of how the sample complexity of learning sums of independent integer random variables scales with the elements in the union of their supports... Keywords: Central limit theorem, sample complexity, sums of independent random variables, equidistribution, semi-agnostic learning. |
| Researcher Affiliation | Collaboration | Anindya De EMAIL DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE UNIVERSITY OF PENNSYLVANIA... Philip M. Long EMAIL GOOGLE... Rocco A. Servedio EMAIL DEPARTMENT OF COMPUTER SCIENCE COLUMBIA UNIVERSITY... |
| Pseudocode | No | The paper describes algorithms conceptually in text, for example in Section 8 titled "The learning result: Learning when |A| ≤ 4", where it details steps like "First main step of the algorithm", "Second main step", and "Third main step". However, it does not present these algorithms in structured pseudocode blocks or figures explicitly labeled as 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | The paper states: "© 2020 Anindya De, Philip M. Long and Rocco A. Servedio. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v21/18-531.html." This refers to the license of the paper itself and does not provide any information about the availability of source code for the described methodology. |
| Open Datasets | No | The paper focuses on theoretical learnability of sums of independent random variables and describes abstract distributions, mentioning that "training data is generated from a distribution that is only ε-close to some A-sum". There is no mention of real-world or publicly available datasets used for empirical validation, nor are there any links, DOIs, or citations to data repositories. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments using real-world datasets. Therefore, there is no discussion or specification of training, test, or validation dataset splits. |
| Hardware Specification | No | The paper is theoretical, analyzing algorithmic results and lower bounds for learning distributions in terms of computational complexity (e.g., 'poly(1/ε) time'). It does not describe any empirical experiments or mention specific hardware components such as GPUs, CPUs, or TPUs used for computations. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical concepts, algorithms, and proofs without describing any practical implementation details. Consequently, it does not list any software dependencies, libraries, or specific version numbers. |
| Experiment Setup | No | The paper is theoretical and does not present empirical experiments. Therefore, it does not provide details on experimental setup, including hyperparameter values, model initialization, optimization settings, or training schedules. |