The Expressive Power of Neural Networks: A View from the Width
Authors: Zhou Lu, Hongming Pu, Feicheng Wang, Zhiqiang Hu, Liwei Wang
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further conduct extensive experiments to provide some insights about the upper bound of such an approximation. To this end, we study a series of network architectures with varied width. For each network architecture, we randomly sample the parameters... The approximation error is empirically calculated... Table 1 lists the results. |
| Researcher Affiliation | Academia | Zhou Lu (1,3), Hongming Pu (1), Feicheng Wang (1,3), Zhiqiang Hu (2), Liwei Wang (2,3). 1: Department of Mathematics, Peking University; 2: Key Laboratory of Machine Perception, MOE, School of EECS, Peking University; 3: Center for Data Science, Peking University, Beijing Institute of Big Data Research |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It refers to a figure and describes a construction informally, but no formal algorithm steps. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. There are no explicit statements about releasing code, nor are there any repository links. |
| Open Datasets | No | The paper describes generating its own 'uniformly placed inputs' and sampling parameters, but it does not provide concrete access information (link, DOI, repository name, or formal citation with authors/year) for a publicly available or open dataset. |
| Dataset Splits | No | The paper states, 'half of all the test inputs from [-1, 1)^n and the corresponding values evaluated by target function constitute the training set.' This describes a training split, but it does not mention a distinct validation set or specify a comprehensive train/validation/test split. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions a 'mini-batch AdaDelta optimizer' but does not provide specific version numbers for any software components, which is required for reproducibility. |
| Experiment Setup | Yes | The training set is used to train the approximator network with a mini-batch AdaDelta optimizer and learning rate 1.0. The parameters of the approximator network are randomly initialized according to [8]. The training process proceeds for 100 epochs for n = 1 and 200 epochs for n = 2; the best approximator function is recorded. |
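The experimental protocol summarized in the table (randomly sampled target networks, uniformly placed inputs on [-1, 1)^n, empirically computed approximation error) can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: the network sizes, the Gaussian parameter sampling, and the L1 error metric are assumptions chosen for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def random_relu_net(rng, n_in, width, depth):
    """Sample a random fully connected ReLU network.

    Hypothetical stand-in for the paper's randomly sampled networks;
    standard-normal parameters are an assumption, not the paper's scheme.
    """
    sizes = [n_in] + [width] * depth + [1]
    params = [(rng.standard_normal((a, b)), rng.standard_normal(b))
              for a, b in zip(sizes[:-1], sizes[1:])]

    def f(x):
        for i, (W, b) in enumerate(params):
            x = x @ W + b
            if i < len(params) - 1:  # no activation on the output layer
                x = relu(x)
        return x

    return f

rng = np.random.default_rng(0)

# Uniformly placed inputs on [-1, 1)^n, here with n = 1.
xs = np.linspace(-1.0, 1.0, 200, endpoint=False).reshape(-1, 1)

# A randomly sampled target network and a (wider) approximator network.
target = random_relu_net(rng, n_in=1, width=3, depth=2)
approx = random_relu_net(rng, n_in=1, width=4, depth=2)

# Empirical approximation error over the input grid (L1 metric assumed).
err = float(np.mean(np.abs(approx(xs) - target(xs))))
print(f"empirical L1 error: {err:.4f}")
```

In the paper's actual experiments the approximator is then trained on half of these input-value pairs (AdaDelta, learning rate 1.0) before the error is measured; the sketch above only shows the sampling and error-evaluation step.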