PhiNets: Brain-inspired Non-contrastive Learning Based on Temporal Prediction Hypothesis
Authors: Satoki Ishikawa, Makoto Yamada, Han Bao, Yuki Takezawa
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through studying the PhiNet model, we discover two findings. First, meaningful data representations emerge in PhiNet more stably than in SimSiam. This is initially supported by our learning dynamics analysis: PhiNet is more robust to the representational collapse. Second, PhiNet adapts more quickly to newly incoming patterns in online and continual learning scenarios. For practitioners, we additionally propose an extension called X-PhiNet integrated with a momentum encoder, excelling in continual learning. All in all, our work reveals that the temporal prediction hypothesis is a reasonable model in terms of the robustness and adaptivity. (From Abstract.) Section 5: EXPERIMENTS |
| Researcher Affiliation | Academia | Satoki Ishikawa¹,², Makoto Yamada², Han Bao³,², Yuki Takezawa²,³. ¹Science Tokyo; ²Okinawa Institute of Science and Technology; ³Kyoto University. Equal contribution. Corresponding author, e-mail: EMAIL |
| Pseudocode | Yes | Appendix C (PSEUDOCODE FOR PHINET AND X-PHINET): The pseudocode for PhiNet and X-PhiNet is shown in Listing 1 and Listing 2. Listing 1: PhiNet Pseudocode (PyTorch-like); Listing 2: X-PhiNet Pseudocode (PyTorch-like) |
| Open Source Code | Yes | https://github.com/riverstone496/PhiNets |
| Open Datasets | Yes | Figure 6 and Table 1 show the sensitivity analysis using CIFAR10 (Krizhevsky, 2009) and ImageNet (Krizhevsky et al., 2012), respectively... by using the CIFAR-5m dataset (Nakkiran et al., 2021)... In addition, the results for different batch sizes and datasets (STL10 (Coates et al., 2011)) can be found in Appendix E.4. |
| Dataset Splits | Yes | For both CIFAR10 and STL10, we used 50,000 samples for training and 10,000 samples for testing. (Section F.1). For ImageNet, we used 1,281,167 samples for training and 100,000 samples for testing. (Section F.2). In split CIFAR10 and split CIFAR-5m, we split CIFAR10 and CIFAR-5m into 5 tasks, each of which contains 2 classes. In split CIFAR100, we split CIFAR100 into 10 tasks, each of which contains 10 classes. (Section F.4). |
| Hardware Specification | Yes | Computational resource: V100 or A100 GPUs (Table 19); 4× V100 GPUs (Table 20). |
| Software Dependencies | No | Listing 1: PhiNet Pseudocode (PyTorch-like). "We have implemented it based on code that is already publicly available" (Section F.1). The paper does not specify version numbers for PyTorch or any other libraries used for the methodology. |
| Experiment Setup | Yes | Table 19 (experimental setups for Figure 6, Table 9, Table 11, Table 12, and Table 13): optimiser SGD; momentum 0.9; learning rate 0.03; epochs 800; encoder backbone ResNet18_cifar_variant1; projector output dimension 2048. Predictor h: latent dimension m = 2048, hidden dimension 512, ReLU activation, batch normalization. Predictor g: latent dimension m = 2048, hidden dimension 512, ReLU activation (hidden), Tanh activation (output), batch normalization. Computational resource: V100 or A100 GPUs. (Similar details in Table 20 for ImageNet.) |
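The Experiment Setup row above flattens the Table 19 hyperparameters into prose. Collecting them into a config dict makes the structure easier to check; the key names below are our own, a sketch of the reported setup rather than the authors' configuration file.

```python
# CIFAR10 setup reported in Table 19 (key names are illustrative, not the paper's).
cifar_setup = {
    "optimizer": {"name": "SGD", "momentum": 0.9, "lr": 0.03},
    "epochs": 800,
    "encoder_backbone": "ResNet18_cifar_variant1",
    "projector_out_dim": 2048,
    "predictor_h": {
        "latent_dim": 2048, "hidden_dim": 512,
        "activation": "ReLU", "batch_norm": True,
    },
    "predictor_g": {
        "latent_dim": 2048, "hidden_dim": 512,
        "hidden_activation": "ReLU", "output_activation": "Tanh",
        "batch_norm": True,
    },
}
```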
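The Dataset Splits row describes the continual-learning protocol: split CIFAR10 and split CIFAR-5m partition the 10 classes into 5 tasks of 2 classes each. That class partitioning can be sketched in plain Python; the helper name and the contiguous class ordering are assumptions for illustration, not the paper's code.

```python
def make_class_splits(num_classes, num_tasks):
    """Partition class indices 0..num_classes-1 into contiguous, equally sized tasks."""
    assert num_classes % num_tasks == 0, "classes must divide evenly into tasks"
    per_task = num_classes // num_tasks
    return [list(range(t * per_task, (t + 1) * per_task)) for t in range(num_tasks)]

# Split CIFAR10 / split CIFAR-5m: 5 tasks, 2 classes each
tasks = make_class_splits(10, 5)
print(tasks)  # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```

The same helper covers split CIFAR100 with `make_class_splits(100, 10)`, yielding 10 tasks of 10 classes each.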