Regularizing Energy among Training Samples for Out-of-Distribution Generalization
Authors: Yiting Chen, Qitian Wu, Junchi Yan
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on long-tail datasets, subpopulation shift benchmarks, and OOD generalization benchmarks to show the effectiveness of the proposed energy regularization. |
| Researcher Affiliation | Academia | ¹School of Computer Science & School of Artificial Intelligence, Shanghai Jiao Tong University; ²Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard. EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods and mathematical derivations but does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using a third-party Python package for influence function calculation: "We implement the calculation of the influence function based on the Python package for calculating the influence function (Lo & Bae, 2022). URL https://github.com/alstonlo/torch-influence.". However, it does not provide an explicit statement or link for the authors' own source code for the methodology described in this paper. |
| Open Datasets | Yes | We evaluate IAER on the imbalanced version of CIFAR10, CIFAR100 (Cui et al., 2019) and ImageNet-LT (Liu et al., 2019) that are artificially created with class imbalance and iNaturalist 2018 (Van Horn et al., 2018), a naturally long-tailed dataset. [...] We conduct our experiments on widely used datasets in SubpopBench, including Colored MNIST (Arjovsky et al., 2019), MetaShift cats vs. dogs (Liang & Zou, 2022), NICO++ (Zhang et al., 2023). [...] Experiments are performed on benchmarks Colored MNIST (Arjovsky et al., 2019), PACS (Li et al., 2017) and VLCS (Fang et al., 2013). |
| Dataset Splits | Yes | CIFAR10 and CIFAR100 both contain 50,000 images in training and 10,000 images in testing with 10 and 100 classes, respectively. We construct the imbalanced version of CIFAR10 and CIFAR100 by reducing the number of images for each class. [...] For a fair comparison, the validation set used to calculate the influence function is sampled from the training set, and the models are not exposed to testing data during training. [...] For ImageNet-LT and iNaturalist, we take images of Many-shot, Medium-shot, and Few-shot classes in the training set as the validation set, respectively. [...] We follow the setting in Gulrajani & Lopez-Paz (2021) to conduct experiments and use the training domain validation set to calculate the influence function. |
| Hardware Specification | Yes | We tested the time cost for calculating the influence function for ResNet-32 on CIFAR10 with an Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz and one GeForce RTX 2080Ti for 5,000 iterations, averaged over ten runs. |
| Software Dependencies | No | The paper mentions using a "Python package for calculating the influence function (Lo & Bae, 2022)", but does not provide specific version numbers for Python or any other libraries/frameworks used. |
| Experiment Setup | Yes | The model is trained for 200 epochs with the SGD optimizer where the learning rate is at 0.1, momentum at 0.9, and weight decay at 2e-4. The learning rate is decayed with factor 0.01 at the 160th and 180th epochs. For IAER, we finetune the model for 5 epochs with batch size at 128 and learning rate at 1e-4. [...] For ImageNet-LT, the classifier is finetuned for 10 epochs with batch size at 512 and learning rate at 0.2. For iNaturalist, the classifier is finetuned for 30 epochs with batch size at 512 and learning rate at 0.2. The γ in Eq. 9 is searched in {0.1, 0.5, 1, 10} for CIFAR-LT and set to be 0.5 for ImageNet-LT and iNaturalist 2018. |
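The step schedule quoted in the Experiment Setup row (base learning rate 0.1, decayed by a factor of 0.01 at epochs 160 and 180) can be sketched as a small helper. This is an illustrative reconstruction, not code from the paper; the function name and defaults are assumptions, and the behavior matches a standard multi-step decay (e.g. PyTorch's `MultiStepLR` with `gamma=0.01`).

```python
def lr_at_epoch(epoch, base_lr=0.1, milestones=(160, 180), gamma=0.01):
    """Step schedule: multiply base_lr by gamma once per milestone reached.

    Illustrative sketch of the reported setup (lr 0.1, decayed by
    factor 0.01 at epochs 160 and 180); not the authors' code.
    """
    return base_lr * gamma ** sum(epoch >= m for m in milestones)
```

Under this reading, epochs 0-159 train at 0.1, epochs 160-179 at 1e-3, and epochs 180-199 at 1e-5.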