Generalization Analysis for Deep Contrastive Representation Learning

Authors: Nong Minh Hieu, Antoine Ledent, Yunwen Lei, Cheng Yeaw Ku

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental
"To compare our results with previous works, we conducted experiments by training fully-connected deep neural networks with the MNIST digits dataset (LeCun, Cortes, and Burges 2010) with a train-test ratio of 75%/25%. We ran two ablation studies to test how our bounds vary with network depth and hidden layer dimension compared to the bounds proposed by Arora et al. (2019) and Lei et al. (2023). ... A summary of our experiment results is presented in Figure 1: the y-axis shows the main factor in our and competing bounds, ignoring constants and logarithmic terms in all cases. The results demonstrate that our generalization bounds outperform the competing ones, especially for larger widths and depths."
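The paper does not release code, so the 75%/25% split quoted above can only be sketched. The function below (`split_indices`, a hypothetical name) shows one minimal way to produce such a split by shuffling indices; it is illustrative, not the authors' implementation.

```python
# Hypothetical sketch of a 75%/25% train-test split on MNIST
# (70,000 images total). Not from the paper's (unreleased) code.
import random

def split_indices(n, train_frac=0.75, seed=0):
    """Shuffle [0, n) and cut at train_frac."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(train_frac * n)
    return idx[:cut], idx[cut:]

# MNIST has 70,000 images in total (60k train + 10k test in the
# original distribution; the paper re-splits at 75%/25%).
train_idx, test_idx = split_indices(70000)
```

With `n = 70000` this yields 52,500 training and 17,500 test indices, matching the quoted ratio.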
Researcher Affiliation: Academia
Nong Minh Hieu (1, 2), Antoine Ledent (2), Yunwen Lei (3), Cheng Yeaw Ku (1). (1) School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 639798; (2) School of Computing and Information Systems, Singapore Management University, Singapore 188065; (3) Department of Mathematics, University of Hong Kong, Pok Fu Lam, Hong Kong. EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode: No
The paper describes theoretical frameworks and mathematical derivations but does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code: No
The paper does not provide any explicit statements about releasing code, a link to a code repository, or mentions of code in supplementary materials.
Open Datasets: Yes
From the Experiments section: "To compare our results with previous works, we conducted experiments by training fully-connected deep neural networks with the MNIST digits dataset (LeCun, Cortes, and Burges 2010) with a train-test ratio of 75%/25%."
Dataset Splits: Yes
From the Experiments section: "To compare our results with previous works, we conducted experiments by training fully-connected deep neural networks with the MNIST digits dataset (LeCun, Cortes, and Burges 2010) with a train-test ratio of 75%/25%."
Hardware Specification: No
The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models or memory specifications.
Software Dependencies: No
The paper does not specify any software names with version numbers used for the experiments.
Experiment Setup: Yes
"For the first experiment, we fixed the hidden layer dimensions to 64 and trained deep neural networks at different depths in the [2, 10] range. For the second experiment, we fixed the depth to L = 3 and trained deep neural networks at different hidden layer dimensions of 32, 64, 128, ... (in multiples of 32). In both experiments, we fixed the output dimension to d = 64 and the number of negative samples to k = 10 (furthermore, additional experiments with k = 64 are provided in Appendix J of the full arXiv version). For all the neural networks trained in both experiments, we set the maximum number of training iterations to 1000 and stopped once the empirical unsupervised loss reached 1e-4, to ensure that all networks roughly converge to the empirical risk minimizers."
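The quoted setup pins down two ablation grids and a stopping rule. A minimal sketch of that configuration is below; all names (`depths`, `widths`, `should_stop`) are our own, and the exact width grid is assumed from the quote (the paper lists "32, 64, 128, ... (in multiples of 32)" without stating the endpoint).

```python
# Hypothetical reconstruction of the two ablation grids and the
# stopping criterion described in the quoted setup.

# Ablation 1: vary depth in [2, 10], hidden layer dimension fixed at 64.
depths = list(range(2, 11))
width_fixed = 64

# Ablation 2: fix depth L = 3, vary hidden layer dimension
# (values listed in the paper; further multiples of 32 may follow).
widths = [32, 64, 128]
depth_fixed = 3

# Shared settings quoted from the paper.
output_dim = 64       # d
num_negatives = 10    # k (k = 64 in the arXiv appendix experiments)
max_iters = 1000
loss_tol = 1e-4

def should_stop(iteration, empirical_loss):
    """Stop at 1000 iterations, or earlier once the empirical
    unsupervised loss reaches 1e-4."""
    return iteration >= max_iters or empirical_loss <= loss_tol
```

The stopping rule is what the quote calls "stopped once the empirical unsupervised loss reached 1e-4", with the 1000-iteration cap as a fallback so every network roughly reaches an empirical risk minimizer.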