Evolving Decomposed Plasticity Rules for Information-Bottlenecked Meta-Learning
Authors: Fan Wang, Hao Tian, Haoyi Xiong, Hua Wu, Jie Fu, Yang Cao, Yu Kang, Haifeng Wang
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our algorithms are tested in challenging random 2D maze environments, where the agents have to use their past experiences to shape the neural connections and improve their performances for the future. The results of our experiment validate the following ... |
| Researcher Affiliation | Collaboration | 1Baidu Inc. 2University of Science and Technology of China 3Beijing Academy of Artificial Intelligence |
| Pseudocode | Yes | Algorithm 1 Inner-Loop Learning |
| Open Source Code | Yes | source code available at https://github.com/WorldEditors/EvolvingPlasticANN |
| Open Datasets | Yes | We validate the proposed method in Meta Maze2D (Wang, 2021), an open-source maze simulator that can generate maze architectures, start positions, and goals at random. |
| Dataset Splits | Yes | For meta-training, each generation includes g = 360 genotypes evaluated on |Ttra| = 12 tasks. ... Every 100 generations we add a validating phase by evaluating the current genotype in |Tvalid| = 1024 (validating tasks). ... The testing tasks include 9x9 mazes (Figure 3 (a)), 15x15 mazes (Figure 3 (b)), and 21x21 mazes (Figure 3 (c)) sampled in advance. There are |Ttst| = 2048 tasks for each level of mazes. |
| Hardware Specification | Yes | The genotypes are distributed to 360 CPUs to execute the inner loops. |
| Software Dependencies | No | The paper mentions a simulator 'Meta Maze2D (Wang, 2021)' and general model components such as 'plastic RNN' and 'LSTM', but does not specify versions for any programming languages, libraries, or frameworks used in the implementation. |
| Experiment Setup | Yes | For meta-training, each generation includes g = 360 genotypes evaluated on |Ttra| = 12 tasks. ... The variance of the noises in Seq-CMA-ES is initially set to be 0.01. ... Meta training goes for at least 15,000 generations... The agents acquire the reward of 1.0 by reaching the goal and 0.01 in other cases. Each episode terminates when reaching the goal, or at the maximum of 200 steps. A life cycle has totally 8 episodes. ... For the outer-loop optimizer (seq-CMA-ES), we used an initial step size of 0.01, and the covariance C = I for all the compared methods. |
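To make the quoted setup concrete, the outer-loop structure described above (g = 360 genotypes per generation, |Ttra| = 12 tasks each, initial step size 0.01) can be sketched as a simple evolution strategy. This is only an illustrative stand-in: the maze environment, the inner-loop lifecycle, and the paper's actual Seq-CMA-ES update are replaced here by a toy fitness function and a plain elite-averaging ES, and all function names are hypothetical.

```python
import random

# Hyperparameters quoted from the paper's setup (see table above).
GENOTYPES_PER_GEN = 360   # g = 360 genotypes per generation
TASKS_PER_GEN = 12        # |Ttra| = 12 meta-training tasks
INIT_STEP_SIZE = 0.01     # initial noise variance in Seq-CMA-ES

def toy_inner_loop(genotype, task_seed):
    """Stand-in for one lifecycle in a maze task (8 episodes, <=200 steps each).

    Returns a scalar fitness; here simply the negative squared distance to
    a task-dependent random target, so higher is better and the optimum is 0.
    """
    rng = random.Random(task_seed)
    target = [rng.uniform(-1.0, 1.0) for _ in genotype]
    return -sum((g - t) ** 2 for g, t in zip(genotype, target))

def evaluate(genotype, gen):
    # Average fitness over this generation's meta-training tasks.
    return sum(toy_inner_loop(genotype, gen * TASKS_PER_GEN + i)
               for i in range(TASKS_PER_GEN)) / TASKS_PER_GEN

def meta_train(dim=4, generations=20, pop=GENOTYPES_PER_GEN, seed=0):
    rng = random.Random(seed)
    mean = [0.0] * dim
    sigma = INIT_STEP_SIZE
    for gen in range(generations):
        # Sample a population of genotypes around the current mean ...
        population = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(pop)]
        # ... evaluate each genotype on the generation's tasks ...
        scored = sorted(population, key=lambda g: evaluate(g, gen),
                        reverse=True)
        # ... and move the mean toward the elite half (simple ES update;
        # the paper instead uses Seq-CMA-ES with covariance C = I initially).
        elite = scored[:pop // 2]
        mean = [sum(g[d] for g in elite) / len(elite) for d in range(dim)]
    return mean
```

In the paper, each of the 360 genotypes is dispatched to its own CPU for the inner loops; the serial loop over `population` above takes the place of that distributed evaluation.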