Revisiting Mode Connectivity in Neural Networks with Bezier Surface
Authors: Jie Ren, Pin-Yu Chen, Ren Wang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our method on CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets using VGG16, ResNet18, and ViT architectures. The codes are available at https://github.com/TIML-Group/MCSurface. |
| Researcher Affiliation | Collaboration | Jie Ren (1, 2), Pin-Yu Chen (3), Ren Wang (1); (1) Illinois Institute of Technology, (2) University of Wisconsin-Madison, (3) IBM Research |
| Pseudocode | Yes | Algorithm 1: Bezier Surface Mode Connectivity Algorithm (Summary); Algorithm 2: Bezier Surface Mode Connectivity Algorithm |
| Open Source Code | Yes | The codes are available at https://github.com/TIML-Group/MCSurface. |
| Open Datasets | Yes | We evaluate our method on three different datasets including CIFAR-10 (Krizhevsky & Hinton, 2009), CIFAR-100 (Krizhevsky & Hinton, 2009), and Tiny Imagenet (Le & Yang, 2015) |
| Dataset Splits | No | The paper uses standard datasets like CIFAR-10, CIFAR-100, and Tiny Imagenet, and mentions evaluating on 'test data' and 'training data'. However, it does not explicitly specify the training/test/validation split percentages, sample counts, or refer to a specific predefined split with citations for these experiments in the main text. |
| Hardware Specification | Yes | The experiments were performed on a single NVIDIA 4090 GPU sampling 80 points per batch on the surface. |
| Software Dependencies | No | The paper does not explicitly mention any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | Algorithm 2: Bezier Surface Mode Connectivity Algorithm. Input: Initial weights $\theta_{00}$, $\theta_{nm}$, $\theta_{0m}$, and $\theta_{n0}$ (fixed four end control points), number of epochs $E_1$, $E_2$, with epoch $E = E_1 + E_2$, learning rate $\eta$, number of random samples $k$, training dataset $D_0$, batch size $B$ |
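
The inputs listed for Algorithm 2 (four fixed corner control points and a number of random samples k) can be illustrated with the textbook Bezier surface formula, in which a point at parameters (u, v) is a Bernstein-weighted sum of an (n+1) x (m+1) grid of control points. The snippet below is a minimal sketch under that assumption and is not the authors' implementation: the function names `bernstein`, `bezier_surface_point`, and `sample_surface` are hypothetical, the control points are random vectors standing in for flattened network weights, the uniform sampling of (u, v) is an assumption, and k = 80 is borrowed from the "80 points per batch" figure quoted in the hardware row purely as an illustrative value.

```python
from math import comb

import numpy as np


def bernstein(n, i, t):
    """Bernstein basis polynomial B_i^n(t) = C(n, i) * t^i * (1 - t)^(n - i)."""
    return comb(n, i) * t**i * (1.0 - t) ** (n - i)


def bezier_surface_point(control, u, v):
    """Evaluate a Bezier surface at (u, v) in [0, 1]^2.

    control has shape (n + 1, m + 1, d): a grid of control points, each a
    d-dimensional vector (here a stand-in for flattened model weights).
    """
    n = control.shape[0] - 1
    m = control.shape[1] - 1
    bu = np.array([bernstein(n, i, u) for i in range(n + 1)])
    bv = np.array([bernstein(m, j, v) for j in range(m + 1)])
    # Weighted sum over both grid axes: sum_i sum_j bu[i] * bv[j] * control[i, j].
    return np.einsum("i,j,ijd->d", bu, bv, control)


def sample_surface(control, k, rng):
    """Draw k parameter pairs uniformly (an assumption) and evaluate the surface."""
    uv = rng.uniform(0.0, 1.0, size=(k, 2))
    points = np.stack([bezier_surface_point(control, u, v) for u, v in uv])
    return uv, points


rng = np.random.default_rng(0)
n, m, d = 2, 2, 5  # hypothetical grid size and weight dimension
control = rng.normal(size=(n + 1, m + 1, d))  # placeholder control points
uv, points = sample_surface(control, k=80, rng=rng)
```

With this parameterization the surface passes exactly through the four corner control points at (u, v) = (0, 0), (0, 1), (1, 0), and (1, 1), which is why those corners can be held fixed as the trained end models while only the interior control points are optimized.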