Compression, Generalization and Learning
Authors: Marco C. Campi, Simone Garatti
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical demonstrations complement the theoretical study and show that the bounds tightly cover the actual stochastic dispersion of the risk. Example 20: We consider a sample of 1000 points drawn in an independent fashion in ℝ³ and an algorithm A that constructs the corresponding convex hull (see Figure 5). The compression function c returns the vertices of the convex hull (in case of multiple points corresponding to the same vertex, only one point is put in the compression), and a new point is appropriate if it belongs to the convex hull. It is easy to check that c satisfies the preference Property 2 and the non-associativity Property 5, and that coherence part I Property 9 and coherence part II Property 18 also hold. Hence, if the probability by which the points are drawn has no concentrated mass (for instance, if it admits a density), then the non-concentrated mass Property 6 is also verified and Theorem 19 can be used to assess the risk. Panels (a) and (b) in Figure 6 profile the region delimited by the upper and lower bounds ε̄k and εk for N = 1000 and δ = 10⁻³. The green dots are generated by Monte-Carlo testing with (a) a Gaussian distribution and (b) a uniform distribution. |
| Researcher Affiliation | Academia | Marco C. Campi, Department of Information Engineering, University of Brescia, via Branze 38, 25123 Brescia, Italy; Simone Garatti, Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, piazza L. da Vinci 32, 20133 Milano, Italy |
| Pseudocode | Yes | The algorithm is formally described below: STEP 0. SET q := 0, P := S ms((x, y)), C = ∅ and xC = x, yC = y; STEP 1. SET q := q + 1 and SOLVE problem ... STEP 4. IF either \|C\| = d or P = ∅ THEN STOP and RETURN ℓj, Rj, j = 1, . . . , q and C; ELSE, GO TO 1. |
| Open Source Code | No | The paper provides MATLAB code in Appendix B for computing the derived bounds ε̄k and εk, but not for implementing the learning methodologies (SVM, SVR, GEM) discussed as applications. There are no other explicit statements about source-code availability for the described methodologies. |
| Open Datasets | No | Example 20: "We consider a sample of 1000 points drawn in an independent fashion in ℝ³ and an algorithm A that constructs the corresponding convex hull (see Figure 5)." No concrete access information for a public dataset is provided. |
| Dataset Splits | No | No specific dataset split information is provided, as the empirical demonstrations in Example 20 use synthetic data ("a sample of 1000 points drawn in an independent fashion in ℝ³") without detailing train/test/validation partitions. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions 'MATLAB code' in Appendix B but does not provide specific version numbers for MATLAB or any associated libraries. |
| Experiment Setup | No | The paper describes theoretical applications to learning schemes like SVM, SVR, and GEM, and presents a Monte-Carlo simulation in Example 20, but it does not provide specific experimental setup details such as hyperparameters, optimizer settings, or training configurations for these methods. |
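The paper itself does not release code for Example 20, but the setup it describes (compression = vertices of the convex hull of i.i.d. points in ℝ³, a new point being "appropriate" when it belongs to the hull, and Monte-Carlo testing of the risk) is straightforward to reproduce. The following is a hypothetical Python sketch, not the authors' MATLAB code; it assumes `scipy.spatial.ConvexHull` for the hull computation, and the variable names (`compression`, `is_appropriate`) are illustrative.

```python
# Hypothetical sketch of Example 20 (not the authors' code): the compression
# of a sample is the set of vertices of its convex hull, a fresh point is
# "appropriate" if it lies inside the hull, and the risk (probability that a
# fresh point is not appropriate) is estimated by Monte-Carlo testing.
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
N = 1000  # sample size used in Example 20

def compression(points):
    """Return the compression (hull vertices) and the hull itself."""
    hull = ConvexHull(points)
    return points[hull.vertices], hull

def is_appropriate(point, hull, tol=1e-9):
    """A new point is appropriate if it belongs to the convex hull.

    Each row of hull.equations is (a, b) with a @ x + b <= 0 for all
    points inside the hull, so membership is a set of linear tests.
    """
    return np.all(hull.equations[:, :-1] @ point + hull.equations[:, -1] <= tol)

sample = rng.standard_normal((N, 3))   # Gaussian case, panel (a) of Figure 6
comp, hull = compression(sample)
k = len(comp)                          # compression size

# Monte-Carlo testing: empirical frequency of inappropriate fresh points
M = 20000
fresh = rng.standard_normal((M, 3))
risk = np.mean([not is_appropriate(p, hull) for p in fresh])
print(f"compression size k = {k}, empirical risk = {risk:.4f}")
```

In the paper, the empirical risk obtained this way is compared against the theoretical interval [εk, ε̄k] from Theorem 19 evaluated at the observed compression size k; the sketch above only reproduces the sampling side of that comparison.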