Exactly Tight Information-theoretic Generalization Bounds via Binary Jensen-Shannon Divergence

Authors: Yuxin Dong, Haoran Guo, Tieliang Gong, Wen Wen, Chen Li

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we assess the tightness of our exactly tight Binary JS bound (Corollary 3.10) in comparison to several existing information-theoretic generalization bounds from the literature. These include the Fast-Rate bound (Theorem 4.3, (Wang & Mao, 2023a)), the Binary KL bound (Theorem 5, (Hellström & Durisi, 2022b)), and the f-information series of oracle bounds: CMI, CSHI, and CJSI (Theorems 3.1, 3.2, and 3.3, (Wang & Mao, 2024)). Our experimental settings align closely with those in (Wang & Mao, 2024), where we evaluate three distinct classification tasks: a simple linear classifier on a synthetic Gaussian dataset; a 4-layer CNN on binarized MNIST (classes 4 vs. 9); and a pretrained ResNet-50 model on CIFAR10. [...] The final results, presented in Figure 3, demonstrate that our Binary JS bound fully captures the dynamics of the generalization error.
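For readers unfamiliar with the quantity in the paper's title, the binary Jensen-Shannon divergence compares two Bernoulli distributions via their binary KL divergences to the midpoint distribution. The sketch below is an illustrative implementation of that standard definition, not code from the paper's repository; the function names and the `eps` clipping are our own choices.

```python
import math

def binary_kl(p, q, eps=1e-12):
    """Binary KL divergence d(p || q) between Bernoulli(p) and Bernoulli(q)."""
    # Clip to the open interval (0, 1) to avoid log(0) at the boundary.
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def binary_js(p, q):
    """Binary Jensen-Shannon divergence between Bernoulli(p) and Bernoulli(q):
    the average of the binary KLs of p and q to their midpoint m = (p + q) / 2."""
    m = 0.5 * (p + q)
    return 0.5 * binary_kl(p, m) + 0.5 * binary_kl(q, m)
```

By construction this quantity is symmetric in its arguments, vanishes when p = q, and is bounded above by log 2, which is what makes it attractive for stating bounds that remain finite.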
Researcher Affiliation Academia School of Computer Science and Technology, Xi'an Jiaotong University. Correspondence to: Tieliang Gong <EMAIL>.
Pseudocode No The paper describes methods mathematically and in text, without presenting any structured pseudocode or algorithm blocks.
Open Source Code Yes https://github.com/Yuxin-Dong/BinaryJS.
Open Datasets Yes Our experimental settings align closely with those in (Wang & Mao, 2024), where we evaluate three distinct classification tasks: a simple linear classifier on a synthetic Gaussian dataset; a 4-layer CNN on binarized MNIST (classes 4 vs. 9); and a pretrained ResNet-50 model on CIFAR10.
Dataset Splits No Our synthetic experimental settings closely follow those in (Wang & Mao, 2024), where synthetic Gaussian datasets are generated using the scikit-learn package. The task involves training a 1-layer linear classification network on 5-dimensional input data points. [...] In addition, we replicate the experimental settings of (Harutyunyan et al., 2021; Hellström & Durisi, 2022b) for two distinct real-world learning tasks: 1) MNIST (4 vs. 9) classification using a 4-layer CNN network, 2) CIFAR10 classification using a pretrained ResNet-50 network. However, the paper does not explicitly state the specific training/validation/test splits with percentages or counts for these datasets in the main text.
Hardware Specification Yes The deep learning models are trained with an Intel Xeon CPU (2.10GHz, 48 cores), 256GB memory, and 4 Nvidia Tesla V100 GPUs (32GB).
Software Dependencies No Our synthetic experimental settings closely follow those in (Wang & Mao, 2024), where synthetic Gaussian datasets are generated using the scikit-learn package. The paper mentions a software package but does not provide specific version numbers for it or any other key software components.
Experiment Setup Yes The model is trained using full-batch gradient descent with a fixed learning rate of 0.01 for 300 epochs. [...] For each learning task, k1 instances of e Z are sampled, and for each e Z, k2 samples of U are drawn, yielding k1 × k2 independent runs in total. The values of (k1, k2) are (5, 30) for MNIST and (2, 40) for CIFAR10, respectively. [...] CNN and ResNet-50 models are trained with mini-batch-based iterative learning algorithms such as SGD and SGLD.
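The quoted k1 × k2 protocol (sample k1 supersamples, then k2 membership masks per supersample) can be sketched as a generic Monte-Carlo loop. This is our own illustrative reconstruction of such an evaluation scheme, not the authors' code: `sample_supersample` and `train_and_eval` are hypothetical callbacks standing in for the paper's dataset construction and training pipeline.

```python
import random

def estimate_gen_error(train_and_eval, sample_supersample, k1, k2, seed=0):
    """Monte-Carlo estimate of the expected generalization gap using
    k1 supersamples and, for each, k2 membership masks U,
    giving k1 * k2 independent runs in total.

    train_and_eval(z_tilde, u) is assumed to train on the examples of
    z_tilde selected by mask u and return (train_err, test_err).
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(k1):
        z_tilde = sample_supersample(rng)   # e.g. n pairs of examples
        for _ in range(k2):
            # One Bernoulli bit per pair selects the training member.
            u = [rng.randint(0, 1) for _ in z_tilde]
            train_err, test_err = train_and_eval(z_tilde, u)
            gaps.append(test_err - train_err)
    return sum(gaps) / len(gaps)
```

With (k1, k2) = (5, 30) for MNIST and (2, 40) for CIFAR10 as reported, this loop would perform 150 and 80 training runs, respectively, which explains why k1 is kept small for the more expensive ResNet-50 task.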