Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout
Authors: Zhao Chen, Jiquan Ngiam, Yanping Huang, Thang Luong, Henrik Kretzschmar, Yuning Chai, Dragomir Anguelov
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that GradDrop outperforms the state-of-the-art multiloss methods within traditional multitask and transfer learning settings, and we discuss how GradDrop reveals links between optimal multiloss training and gradient stochasticity. |
| Researcher Affiliation | Industry | Zhao Chen (Waymo LLC, Mountain View, CA 94043, EMAIL); Jiquan Ngiam (Google Research, Mountain View, CA 94043, EMAIL); Yanping Huang (Google Research, Mountain View, CA 94043, EMAIL); Thang Luong (Google Research, Mountain View, CA 94043, EMAIL); Henrik Kretzschmar (Waymo LLC, Mountain View, CA 94043, EMAIL); Yuning Chai (Waymo LLC, Mountain View, CA 94043, EMAIL); Dragomir Anguelov (Waymo LLC, Mountain View, CA 94043, EMAIL) |
| Pseudocode | Yes | Algorithm 1: Gradient Sign Dropout Layer (GradDrop Layer) |
| Open Source Code | No | The paper does not provide an unambiguous statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We also rely exclusively on standard public datasets, and thus move discussion of most dataset properties to the Appendices. [...] We first test GradDrop on the multitask learning dataset CelebA [26] [...] We transfer ImageNet2012 [5] to CIFAR-100 [21] [...] 3D vehicle detection from point clouds on the Waymo Open Dataset [42]. |
| Dataset Splits | No | The paper states that it 'relies exclusively on standard public datasets' and conducts 'training runs', but does not explicitly provide specific details on the train/validation/test dataset splits (e.g., percentages, sample counts, or a detailed splitting methodology) required for reproduction. |
| Hardware Specification | Yes | All experiments are run on NVIDIA V100 GPU hardware. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library or solver names with versions like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | We will provide relevant hyperparameters within the main text, but we relegate a complete listing of hyperparameters to the Appendix. For many of our experiments, we renormalize the final gradients so that \|\|r\|\|2 remains constant throughout the GradDrop process. For our final GradDrop model we use a leak parameter ℓ set to 1.0 for the source set. All runs include gradient clipping at norm 1.0. |
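The Pseudocode row refers to Algorithm 1 (the GradDrop layer): per-task gradients vote elementwise on a sign via a "positive sign purity" score, a single uniform sample decides which sign survives at each coordinate, and gradients of the losing sign are dropped (optionally softened by a per-task leak parameter ℓ). A minimal NumPy sketch of that procedure, with our own function and parameter names (`graddrop`, `leak`, `rng` are assumptions, not the authors' code):

```python
import numpy as np

def graddrop(grads, leak=None, rng=None):
    """Sketch of a Gradient Sign Dropout step over a list of per-task
    gradient tensors of identical shape. Not the authors' implementation."""
    rng = np.random.default_rng() if rng is None else rng
    grads = [np.asarray(g, dtype=float) for g in grads]
    if leak is None:
        leak = [0.0] * len(grads)  # leak ℓ_i = 0: pure sign dropout

    total = sum(grads)
    abs_total = sum(np.abs(g) for g in grads)
    # Gradient positive sign purity: P = 0.5 * (1 + Σ_i ∇_i / Σ_i |∇_i|),
    # elementwise; P = 1 when all tasks agree on positive sign, 0 on negative.
    P = 0.5 * (1.0 + total / np.maximum(abs_total, 1e-12))

    # One shared uniform sample per coordinate picks the surviving sign.
    U = rng.random(total.shape)

    out = np.zeros_like(total)
    for g, l in zip(grads, leak):
        keep = ((U < P) & (g > 0)) | ((U >= P) & (g < 0))
        mask = l + (1.0 - l) * keep  # ℓ_i = 1 lets task i pass unmasked
        out += mask * g
    return out
```

With fully agreeing signs the layer reduces to a plain gradient sum, and with ℓ = 1.0 for a task (as quoted above for the source set in transfer learning) that task's gradient always passes through.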