Robust Learning against Relational Adversaries
Authors: Yizhen Wang, Mohannad Alhanahnah, Xiaozhu Meng, Ke Wang, Mihai Christodorescu, Somesh Jha
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results of both tasks show our learning framework significantly improves the robustness of models against relational adversaries. In the process, it outperforms adversarial training, the most noteworthy defense mechanism, by a wide margin. We now evaluate the effectiveness of N&P against relational attacks for real-world attacks. Our empirical evaluation shows that input normalization can significantly enhance model robustness. |
| Researcher Affiliation | Collaboration | Yizhen Wang (Visa Research); Mohannad Alhanahnah (University of Wisconsin-Madison); Xiaozhu Meng (Rice University); Ke Wang (Visa Research); Mihai Christodorescu (Visa Research); Somesh Jha (University of Wisconsin-Madison) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The normalizer in our paper is now open sourced on https://github.com/Mohannadcse/Normalizer-authorship. |
| Open Datasets | Yes | We use the dataset provided by Quiring et al. [2019], which is collected from Google Code Jam, https://github.com/EQuiw/code-imitator/tree/master/data/dataset_2017. |
| Dataset Splits | Yes | We sample 19,000 benign PEs and 19,000 malicious PEs to construct the training (60%), validation (20%), and test (20%) sets. |
| Hardware Specification | Yes | The standard adversarial training is too computationally expensive for the attack on source code level. We make a number of adaptations that reduce the number of MCTS roll-outs and generate adversarial examples in batch for better parallelism so that the process finishes within a month on a 72-core CPU server. |
| Software Dependencies | No | The paper mentions software like LIEF and Clang, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We use the same network architecture as Al-Dujaili et al. [2018], a fully-connected neural net with three hidden layers, each with 300 ReLU nodes, to set up a fair comparison. We train each model to minimize the negative log-likelihood loss for 20 epochs, and pick the version with the lowest validation loss. |
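The architecture quoted in the Experiment Setup row (three hidden layers of 300 ReLU units, trained against a negative log-likelihood loss) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the input dimension, number of classes, batch size, and He initialization are illustrative assumptions not stated in the excerpt.

```python
import numpy as np

rng = np.random.default_rng(0)
IN_DIM, HIDDEN, N_CLASSES = 1024, 300, 2  # IN_DIM and N_CLASSES are placeholders

def init_layer(fan_in, fan_out):
    # He initialization, a common (assumed) choice for ReLU layers.
    w = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
    return w, np.zeros(fan_out)

# Three hidden layers of 300 units each, plus an output layer,
# matching the fully-connected architecture described in the paper.
layers = [init_layer(IN_DIM, HIDDEN),
          init_layer(HIDDEN, HIDDEN),
          init_layer(HIDDEN, HIDDEN),
          init_layer(HIDDEN, N_CLASSES)]

def forward(x):
    h = x
    for w, b in layers[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU activation on hidden layers
    w, b = layers[-1]
    logits = h @ w + b
    # Log-softmax, computed with the max-subtraction trick for stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    return logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

def nll_loss(log_probs, y):
    # Negative log-likelihood of the true class, averaged over the batch.
    return -log_probs[np.arange(len(y)), y].mean()

x = rng.normal(size=(8, IN_DIM))
y = rng.integers(0, N_CLASSES, size=8)
loss = nll_loss(forward(x), y)
```

Training (20 epochs, selecting the checkpoint with the lowest validation loss, as the paper describes) would wrap this forward/loss pair in a gradient-descent loop, which is omitted here for brevity.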