Toward Understanding Convolutional Neural Networks from Volterra Convolution Perspective
Authors: Tenghui Li, Guoxu Zhou, Yuning Qiu, Qibin Zhao
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this subsection, we validate the approximation of two simple structures (truncated versions of Lemma 12 and Lemma 13), which are two fundamental structures in our proof and which demonstrate the effectiveness and correctness of our results. We randomly generate x ∈ M64, h ∈ M9 and g ∈ M5. Convolution with sigmoid activation is plotted in Figure 6a and the approximated order-five Volterra convolution is plotted in Figure 6b. The reconstruction error (L2-norm) between the left- and right-hand sides is 5.64877e-07. ... To validate the effectiveness of the hacking network, we build a network and compute its order-zero and order-one proxy kernels manually. Then, we train a hacking network to infer these kernels and check whether they match. ... After 512 training episodes, the network achieved 98.280% accuracy on the test set. |
| Researcher Affiliation | Academia | Tenghui Li EMAIL School of Automation, Guangdong University of Technology, Guangzhou, China ... Guoxu Zhou EMAIL School of Automation, Guangdong University of Technology, Guangzhou, China ... Yuning Qiu EMAIL Guangdong Key Laboratory of IoT Information Technology, Guangdong University of Technology, Guangzhou, China ... Qibin Zhao EMAIL RIKEN Center for Advanced Intelligence Project, Tokyo, Japan; School of Automation, Guangdong University of Technology, Guangzhou, China |
| Pseudocode | No | The paper provides mathematical definitions, theorems, lemmas, and proofs, but does not include any clearly labeled pseudocode or algorithm blocks with structured, code-like steps. |
| Open Source Code | Yes | All code associated with this article is publicly accessible at GitHub: https://github.com/tenghuilee/nnvolterra.git. |
| Open Datasets | Yes | First, we build a hacking network to approximate the order-one proxy kernel of a classifier network trained on the MNIST data set (Lecun et al., 1998). |
| Dataset Splits | No | The paper mentions "Training images" and "test set" for the MNIST dataset. It also states "we randomly choose twelve images from the test set". However, it does not provide specific percentages or counts for how the dataset was split into training, validation, or test sets, nor does it explicitly refer to a standard split with precise details. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running its experiments, such as GPU models, CPU types, or other computer specifications. |
| Software Dependencies | No | The paper implicitly suggests the use of PyTorch through a footnote reference (pytorch.org/docs/stable/generated/torch.nn.functional.conv_transpose1d.html), but it does not specify version numbers for PyTorch, Python, or any other software libraries or dependencies. |
| Experiment Setup | No | The paper mentions some aspects of the training process, such as "Training images are all scaled to [0, 1]", noise addition, "512 training episodes", and minimizing "mean square error". However, it lacks specific details regarding key hyperparameters like learning rate, batch size, or the optimizer used for training. |
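The first experiment quoted above (the truncated version of Lemma 12) rests on the idea that a convolution followed by a sigmoid activation can be approximated by a low-order Volterra convolution, since a Taylor expansion of the activation turns σ(x ∗ h) into a polynomial in the linear convolution output, and each power (x ∗ h)ⁿ is a Volterra term whose order-n kernel is the n-fold outer product of h. The sketch below is a hypothetical illustration of that idea in NumPy, not the paper's exact construction; the input sizes (64 and 9) mirror the quoted setup, while the 0.1 scaling is an assumption made to keep the expansion accurate.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Small random signal and filter, echoing the paper's x in M64 and h in M9.
# The 0.1 scale is an assumption: it keeps |x * h| small so the truncated
# Taylor series of the sigmoid is an accurate approximation.
rng = np.random.default_rng(0)
x = 0.1 * rng.standard_normal(64)
h = 0.1 * rng.standard_normal(9)

y = np.convolve(x, h, mode="valid")   # linear convolution output
exact = sigmoid(y)                    # convolution + sigmoid activation

# Order-five truncation via the sigmoid's Taylor series at 0:
# sigma(t) = 1/2 + t/4 - t^3/48 + t^5/480 - ...
# Each power y**n corresponds to an order-n Volterra term with a
# separable kernel built from h.
approx = 0.5 + y / 4 - y**3 / 48 + y**5 / 480

err = np.linalg.norm(exact - approx)
print(f"L2 reconstruction error: {err:.3e}")
```

With inputs this small, the L2 reconstruction error is tiny, which is qualitatively consistent with the small error (5.64877e-07) reported in the excerpt; the exact figure there depends on the paper's own kernel construction and sampling.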