Recursive Teaching Dimension, VC-Dimension and Sample Compression
Authors: Thorsten Doliwa, Gaojian Fan, Hans Ulrich Simon, Sandra Zilles
JMLR 2014 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper is concerned with various combinatorial parameters of classes that can be learned from a small set of examples. We show that the recursive teaching dimension, recently introduced by Zilles et al. (2008), is strongly connected to known complexity notions in machine learning, e.g., the self-directed learning complexity and the VC-dimension. To the best of our knowledge these are the first results unveiling such relations between teaching and query learning as well as between teaching and the VC-dimension. It will turn out that for many natural classes the RTD is upper-bounded by the VCD, e.g., classes of VCdimension 1, intersection-closed classes and finite maximum classes. However, we will also show that there are certain (but rare) classes for which the recursive teaching dimension exceeds the VC-dimension. Moreover, for maximum classes, the combinatorial structure induced by the RTD, called teaching plan, is highly similar to the structure of sample compression schemes. Indeed one can transform any repetition-free teaching plan for a maximum class C into an unlabeled sample compression scheme for C and vice versa, while the latter is produced by (i) the corner-peeling algorithm of Rubinstein and Rubinstein (2012) and (ii) the tail matching algorithm of Kuzmin and Warmuth (2007). |
| Researcher Affiliation | Academia | Thorsten Doliwa EMAIL Faculty of Mathematics Ruhr-Universität Bochum, D-44780 Bochum, Germany Gaojian Fan EMAIL Department of Computing Science University of Alberta, Edmonton, AB, T6G 2E8, Canada Hans Ulrich Simon EMAIL Horst-Görtz Institute for IT Security and Faculty of Mathematics Ruhr-Universität Bochum, D-44780 Bochum, Germany Sandra Zilles EMAIL Department of Computer Science University of Regina, Regina, SK, S4S 0A2, Canada |
| Pseudocode | No | The paper describes mathematical concepts, definitions (e.g., Definition 1, Definition 2, Definition 5, Definition 6), lemmas (e.g., Lemma 3, Lemma 7, Lemma 10, Lemma 12, Lemma 13, Lemma 14, Lemma 16, Lemma 22, Lemma 23, Lemma 24, Lemma 27, Lemma 30, Lemma 33) and theorems (e.g., Theorem 11, Theorem 15, Theorem 17, Theorem 20, Theorem 26, Theorem 29, Theorem 31, Theorem 32) along with their proofs. There are no structured pseudocode blocks or algorithms explicitly labeled or formatted as such. |
| Open Source Code | No | The paper makes no explicit mention of open-source code availability, links to code repositories, or code provided in supplementary materials. |
| Open Datasets | No | The paper primarily deals with theoretical concept classes and mathematical constructs (e.g., 'domain X', 'd-dimensional boxes over domain [n]d', 'Warmuth’s class CW'). It does not describe or utilize any empirical datasets for experiments, nor does it provide any links, DOIs, repository names, or formal citations for publicly available datasets. |
| Dataset Splits | No | The paper is theoretical and does not involve experimental evaluation on datasets, therefore, there is no mention of training, validation, or test dataset splits. |
| Hardware Specification | No | As the paper presents theoretical research without empirical experiments, there is no mention of hardware specifications used for computation. |
| Software Dependencies | No | The paper is theoretical and does not describe any implemented systems or experiments, thus it does not list any specific software dependencies or version numbers. |
| Experiment Setup | No | The paper is theoretical and focuses on mathematical proofs and definitions, rather than empirical experiments. Therefore, it does not include details on experimental setup, hyperparameters, or system-level training settings. |