Dimension Estimation Using Random Connection Models

Authors: Paulo Serra, Michel Mandjes

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A simulation study on both real and simulated data shows that our approach compares favourably with some competing methods from the literature, including approaches that rely on distance information. In Section 7 we present some numerical illustrations for our method, and we propose our bias corrected estimator.
Researcher Affiliation | Academia | Paulo Serra EMAIL Department of Mathematics and Computer Science Groene Loper 5 Meta Forum Building Eindhoven University of Technology 5612 AZ Eindhoven, the Netherlands; Michel Mandjes EMAIL Korteweg-de Vries Institute for Mathematics Science Park 105-107 University of Amsterdam 1098 XG Amsterdam, the Netherlands
Pseudocode | No | The paper describes the estimation process and formulas in detail, particularly in Section 4 'Estimation of the Intrinsic Dimension', but does not include a distinct pseudocode block or algorithm listing.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to a code repository or supplementary materials for code.
Open Datasets | Yes | We consider twelve data sets; seven consist of simulated data, and five of real data. The Isomap faces data set contains 698 images (D = 64 × 64 pixels) of a rendered face of a sculpture taken from different angles, under different lighting conditions. The Hands data set contains 481 frames (D = 512 × 480 pixels) from a video of a hand holding a rice bowl and revolving it while moving from right to left. The MNIST data sets contain 7141, 6824, and 6313 images (D = 28 × 28 pixels) of handwritten digits 3, 4, and 5, respectively.
Dataset Splits | No | The paper mentions using simulated data and real datasets (Isomap faces, Hands, MNIST) but does not specify training/test/validation splits. For the MNIST dataset, it states the total number of images for specific digits (e.g., '7141, 6824, and 6313 images'), but no split methodology is described.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models, or memory specifications.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers used for the implementation or experiments.
Experiment Setup | Yes | We set the distribution of the design points X ~ N_d(0, I), for d ∈ {1, 2, 3, 4, 5, 10}, and chose n ∈ {10^3, 10^4, 10^5, 10^6, 10^7}; irrespectively of the dimension we always set ϵ = ϵ_n = 4/(log n)^(1/2). Based on the discussion from the previous section, the parameter m_n was set to max(1, log n).
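
The simulation settings quoted in the Experiment Setup row can be tabulated with a short script. The following is only a sketch under stated assumptions, not the authors' code (the paper releases none): it assumes that log denotes the natural logarithm and that m_n is used without rounding, and the function names `epsilon_n`, `m_n`, `sample_design_points`, and `settings_grid` are hypothetical. It uses only the Python standard library and covers the parameter grid and the sampling of design points; it does not implement the dimension estimator itself.

```python
import math
import random

# Grid of settings quoted from the paper's simulation study.
DIMENSIONS = (1, 2, 3, 4, 5, 10)
SAMPLE_SIZES = (10**3, 10**4, 10**5, 10**6, 10**7)


def epsilon_n(n: int) -> float:
    """Return eps_n = 4 / (log n)^(1/2). Assumption: natural logarithm."""
    if n < 2:
        raise ValueError("n must be at least 2 so that log n > 0")
    return 4.0 / math.sqrt(math.log(n))


def m_n(n: int) -> float:
    """Return m_n = max(1, log n). Assumption: natural log, no rounding."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return max(1.0, math.log(n))


def sample_design_points(n: int, d: int, seed: int = 0) -> list[list[float]]:
    """Draw n design points from the d-dimensional standard normal N_d(0, I)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


def settings_grid() -> list[tuple[int, int, float, float]]:
    """List (d, n, eps_n, m_n) for every combination in the quoted setup."""
    return [(d, n, epsilon_n(n), m_n(n)) for d in DIMENSIONS for n in SAMPLE_SIZES]


if __name__ == "__main__":
    # eps_n and m_n depend only on n, so print them once per sample size.
    for n in SAMPLE_SIZES:
        print(f"n = {n:>8}: eps_n = {epsilon_n(n):.4f}, m_n = {m_n(n):.4f}")
    # Small illustrative sample; the larger n in the grid need more memory.
    points = sample_design_points(n=5, d=3)
    print(points[0])
```

Because ϵ_n and m_n depend only on n, the grid repeats the same pair of values across all six dimensions; only the sampled design points change with d.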