Hyperparameter Learning via Distributional Transfer
Authors: Ho Chung Leon Law, Peilin Zhao, Lucian Chan, Junzhou Huang, Dino Sejdinovic
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, across a range of regression and classification tasks, our methodology performs favourably at initialisation and has a faster convergence compared to existing baselines; in some cases, the optimal accuracy is achieved in just a few evaluations. |
| Researcher Affiliation | Collaboration | Ho Chung Leon Law (University of Oxford); Peilin Zhao (Tencent AI Lab); Lucian Chan (University of Oxford); Junzhou Huang (Tencent AI Lab); Dino Sejdinovic (University of Oxford) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It mentions using TensorFlow for implementation but does not share its own code. |
| Open Datasets | Yes | In particular, the Protein dataset consists of 7 different proteins extracted from [9]: ADAM17, AKT1, BRAF, COX1, FXA, GR, VEGFR2. |
| Dataset Splits | Yes | For testing, we use the same number of samples si for toy data, while using a 60-40 train-test split for real data. |
| Hardware Specification | Yes | Training time is less than 2 minutes on a standard 2.60GHz single-core CPU in all experiments. |
| Software Dependencies | No | The paper mentions using 'TensorFlow [1] for implementation' and 'SciPy [14]', but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For φx and φy we will use a single hidden layer NN with tanh activation (with 20 hidden and 10 output units), except for classification tasks, where we use a one-hot encoding for φy. [...] For BLR, we will follow [26] and take feature map υ to be a NN with three 50-unit layers and tanh activation. [...] We take the embedding batch-size b = 1000, and learning rate for ADAM to be 0.005. |
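The quoted experiment setup fully specifies the feature-map architectures. A minimal NumPy sketch of those shapes, assuming a linear output layer and a small random initialisation (the paper's quote fixes only the layer sizes and tanh activation; the initialisation scale, input dimension `d_in`, and the names `phi_x`/`upsilon` are illustrative assumptions, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(layer_sizes):
    # Random weights for a tanh MLP; the 0.1 scale is an assumption.
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(params, x):
    # tanh on every layer except the last (linear output is an assumption).
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

d_in = 5  # input dimension is problem-dependent (assumed here)

# phi_x / phi_y: single hidden layer, tanh, 20 hidden and 10 output units.
phi_x = init_mlp([d_in, 20, 10])

# upsilon (BLR feature map, following [26]): three 50-unit tanh layers.
upsilon = init_mlp([d_in, 50, 50, 50])

# Embedding batch size b = 1000, as quoted from the paper.
x = rng.normal(size=(1000, d_in))
print(forward(phi_x, x).shape)    # (1000, 10)
print(forward(upsilon, x).shape)  # (1000, 50)
```

The quoted ADAM learning rate of 0.005 would apply to training these parameters; the sketch above only checks that the stated layer sizes compose.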