Interpreting and Boosting Dropout from a Game-Theoretic View
Authors: Hao Zhang, Sen Li, YinChao Ma, Mingjie Li, Yichen Xie, Quanshi Zhang
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We prove that dropout can suppress the strength of interactions between input variables of deep neural networks (DNNs). The theoretic proof is also verified by various experiments. |
| Researcher Affiliation | Academia | Hao Zhang (Shanghai Jiao Tong University); Sen Li (Sun Yat-sen University); Yinchao Ma (Huazhong University of Science and Technology); Mingjie Li (Shanghai Jiao Tong University); Yichen Xie (Shanghai Jiao Tong University); Quanshi Zhang (Shanghai Jiao Tong University) |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper mentions using and referring to third-party codebases (e.g., pytorch-cifar100, suinleelab) but does not state that the authors are releasing their own code for the described methodology. |
| Open Datasets | Yes | MNIST (LeCun et al., 1998), CelebA (Liu et al., 2015), Tiny ImageNet (Le & Yang, 2015), CIFAR-10 (Krizhevsky & Hinton, 2009), SST-2 (Socher et al., 2013) |
| Dataset Splits | No | The paper does not specify exact train/validation/test split percentages or absolute sample counts for each split. It mentions sampling training data but not a clear splitting methodology for reproduction. |
| Hardware Specification | Yes | We trained AlexNet and VGG-11 using the CIFAR-10 dataset on a GeForce GTX 1080 Ti GPU. |
| Software Dependencies | No | The paper references PyTorch implicitly via the 'pytorch-cifar100' codebase but does not give a version number for PyTorch or for any other software dependency. |
| Experiment Setup | Yes | For each DNN, we put the dropout operation and the interaction loss in the low convolutional layer (before the 3rd/5th convolutional layer of the AlexNet/VGGs) and the high fully-connected layer (before the 2nd fully-connected layer), respectively... when we trained DNNs with dropout, we set the dropout rate as 0.5... In this paper, we set α=0.05... Thus, we set the sampling number as 500 in all other experiments in this paper. |
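The experiment setup quotes a dropout rate of 0.5. As a minimal illustration of the operation being configured there (a hypothetical sketch in plain Python, not the authors' code or a PyTorch layer), standard inverted dropout zeroes each unit with probability p and rescales survivors so the expected activation is unchanged:

```python
import random

def dropout(values, p=0.5, rng=None, train=True):
    """Inverted dropout: zero each unit with probability p and scale
    survivors by 1/(1-p) so the expected activation is unchanged.
    At test time the input passes through untouched."""
    if not train:
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]

# Dropout rate 0.5, matching the setting quoted from the paper.
x = [1.0, 2.0, 3.0, 4.0]
print(dropout(x, p=0.5, rng=random.Random(0)))
```

In a real DNN this mask would be applied to the activations entering the layers named in the setup (e.g. before the 3rd convolutional layer); the sketch only shows the elementwise operation itself.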
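The "sampling number as 500" in the setup refers to Monte Carlo estimation of the interactions between input variables that the paper studies. A hedged sketch of such a pairwise-interaction estimate, E_S[v(S∪{i,j}) − v(S∪{i}) − v(S∪{j}) + v(S)] over randomly sampled contexts S, using a toy value function (all names and the toy v below are illustrative, not from the paper):

```python
import random

def interaction(v, i, j, players, n_samples=500, rng=None):
    """Monte Carlo estimate of the pairwise interaction between
    players i and j: average over random contexts S (drawn from the
    remaining players) of v(S|{i,j}) - v(S|{i}) - v(S|{j}) + v(S)."""
    rng = rng or random.Random()
    rest = [p for p in players if p not in (i, j)]
    total = 0.0
    for _ in range(n_samples):
        S = frozenset(p for p in rest if rng.random() < 0.5)
        total += v(S | {i, j}) - v(S | {i}) - v(S | {j}) + v(S)
    return total / n_samples

# Toy value function: additive rewards plus an explicit 0-1 interaction.
def v(S):
    base = sum({0: 1.0, 1: 2.0, 2: 0.5}.get(p, 0.0) for p in S)
    return base + (3.0 if 0 in S and 1 in S else 0.0)

print(interaction(v, 0, 1, players=[0, 1, 2], rng=random.Random(0)))
```

For this toy v the additive terms cancel in every sample, so the estimator recovers the interaction term exactly; for a DNN, v(S) would be the network output with the variables outside S masked, which is where the sampling budget of 500 matters.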