Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Doubly infinite residual neural networks: a diffusion process approach
Authors: Stefano Peluchetti, Stefano Favaro
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we review the results of Peluchetti and Favaro (2020), extending them to convolutional ResNets, and we establish analogous backward-propagation results, which directly relate to the problem of training fully-connected deep ResNets. Then, we investigate the more general setting of doubly infinite neural networks... In Section 6 we present numerical experiments, and in Section 7 we discuss our work and directions for future work. |
| Researcher Affiliation | Collaboration | Stefano Peluchetti EMAIL Cogent Labs 106-0032, Tokyo, Japan. Stefano Favaro EMAIL Department of Economics and Statistics University of Torino and Collegio Carlo Alberto 10122, Torino, Italy. |
| Pseudocode | No | The paper primarily focuses on mathematical derivations of stochastic differential equations and their properties. It describes procedures using mathematical notation and theoretical explanations, but it does not contain any explicitly labeled pseudocode or algorithm blocks with structured steps typically found in computational algorithms. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to code repositories. Mentions of 'Attribution requirements are provided at http://jmlr.org/papers/v22/20-706.html' refer to paper attribution, not code availability. |
| Open Datasets | Yes | We consider the MNIST dataset (LeCun, 1998). |
| Dataset Splits | Yes | We consider 20.000 randomly sampled observations from the training portion of the MNIST dataset, and we compute the test accuracy on the test portion of the MNIST dataset, which is composed of 10.000 observations. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU specifications, or cloud computing environments. It only briefly mentions limiting model size 'For computation reasons'. |
| Software Dependencies | No | The paper mentions training methods like 'full-batch GD training', 'SGD training', and 'Adam (Kingma and Ba, 2015)', but it does not specify any software frameworks (e.g., PyTorch, TensorFlow) or their version numbers. |
| Experiment Setup | Yes | We consider Ftanh trained via full-batch GD training and average MSE loss... We consider 20.000 randomly sampled observations... We consider 120 epochs. We use a single learning rate tuned to optimize final test accuracy... we perform SGD training of Ftanh, with batches of 200 observations each... We consider Ftanh with σ²_w = 1 and σ²_b = 0.1². The use of a smaller bias variance is common in the NTK literature (Arora et al., 2019). For numerical stability the model is augmented with a small noise variance equal to 1/20.000... |
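The training configuration quoted above (a tanh network trained by full-batch GD with average MSE loss, 120 epochs, weight variance σ²_w = 1 and a smaller bias variance) can be illustrated with a minimal sketch. This is not the authors' code: the network sizes, learning rate, and synthetic stand-in data (in place of the 20.000-sample MNIST subset) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes standing in for the MNIST subset; not the paper's dimensions.
n, d_in, d_hid, d_out = 256, 20, 64, 10
sigma_w2, sigma_b2 = 1.0, 0.1 ** 2  # weight/bias variances from the quoted setup

X = rng.standard_normal((n, d_in))
Y = np.eye(d_out)[rng.integers(0, d_out, n)]  # one-hot labels

# NTK-style initialization: weights scaled by sqrt(sigma_w2 / fan_in).
W1 = rng.standard_normal((d_in, d_hid)) * np.sqrt(sigma_w2 / d_in)
b1 = rng.standard_normal(d_hid) * np.sqrt(sigma_b2)
W2 = rng.standard_normal((d_hid, d_out)) * np.sqrt(sigma_w2 / d_hid)
b2 = rng.standard_normal(d_out) * np.sqrt(sigma_b2)

lr, epochs = 0.1, 120  # 120 epochs as reported; the learning rate is a guess

losses = []
for _ in range(epochs):
    # Forward pass through a single tanh hidden layer.
    H = np.tanh(X @ W1 + b1)
    out = H @ W2 + b2
    err = out - Y
    losses.append(float((err ** 2).mean()))

    # Full-batch gradient descent on the average MSE loss.
    g_out = 2 * err / err.size
    gW2, gb2 = H.T @ g_out, g_out.sum(axis=0)
    g_hid = (g_out @ W2.T) * (1 - H ** 2)  # tanh' = 1 - tanh^2
    gW1, gb1 = X.T @ g_hid, g_hid.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Running the sketch shows the average MSE loss decreasing over the 120 full-batch steps; on the actual MNIST subset the authors instead tune a single learning rate for final test accuracy.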