Representation Norm Amplification for Out-of-Distribution Detection in Long-Tail Learning
Authors: Dong Geun Shin, Hye Won Chung
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that RNA achieves superior performance in both OOD detection and classification compared to the state-of-the-art methods, by 1.70% and 9.46% in FPR95 and 2.43% and 6.87% in classification accuracy on CIFAR10-LT and ImageNet-LT, respectively. |
| Researcher Affiliation | Academia | Dong Geun Shin, School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST); Hye Won Chung, School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST) |
| Pseudocode | Yes | C Pseudocode The training scheme of Representation Norm Amplification (RNA) is shown in Algorithm 1, and the evaluation scheme of Representation Norm (RN) is shown in Algorithm 2. |
| Open Source Code | Yes | The code for this work is available at https://github.com/dgshin21/RNA. |
| Open Datasets | Yes | Datasets and training setup For ID datasets, we use CIFAR10/100 (Krizhevsky, 2009) and ImageNet-1k (Deng et al., 2009). The long-tailed training sets, CIFAR10/100-LT, are built by downsampling CIFAR10/100 (Cui et al., 2019)... For experiments on CIFAR10/100, we use the semantically coherent out-of-distribution (SC-OOD) benchmark datasets (Yang et al., 2021) as our OOD test sets. For CIFAR10 (respectively, CIFAR100), we use Textures (Cimpoi et al., 2014), SVHN (Netzer et al., 2011), CIFAR100 (respectively, CIFAR10), Tiny ImageNet (Le & Yang, 2015), LSUN (Yu et al., 2015), and Places365 (Zhou et al., 2018) from the SC-OOD benchmark. |
| Dataset Splits | Yes | The long-tailed training sets, CIFAR10/100-LT, are built by downsampling CIFAR10/100 (Cui et al., 2019), making the imbalance ratio, maxc N(c)/ minc N(c), equal to 100, where N(c) is the number of samples in class c. ... For experiments on CIFAR10/100, we use the semantically coherent out-of-distribution (SC-OOD) benchmark datasets (Yang et al., 2021) as our OOD test sets. For CIFAR10 (respectively, CIFAR100), we use Textures (Cimpoi et al., 2014), SVHN (Netzer et al., 2011), CIFAR100 (respectively, CIFAR10), Tiny ImageNet (Le & Yang, 2015), LSUN (Yu et al., 2015), and Places365 (Zhou et al., 2018) from the SC-OOD benchmark. For ImageNet, we use ImageNet-LT (Liu et al., 2019) as the long-tailed training set and ImageNet-Extra (Wang et al., 2022b) as the auxiliary OOD training set, and ImageNet-1k-OOD (Wang et al., 2022b) as our OOD test set. All the OOD training sets and OOD test sets are disjoint. |
| Hardware Specification | Yes | We run the experiments on NVIDIA A6000 GPUs. For CIFAR10/100-LT datasets, we use a single GPU for each experimental run, and the entire training process takes about an hour. For ImageNet-LT, we use 7 GPUs for each experimental run, and the entire training process takes about 4 and a half hours. |
| Software Dependencies | No | No specific versions for software dependencies (e.g., Python, PyTorch, TensorFlow, etc.) are provided in the paper. It only references optimizers and schedulers by their originators: "Adam optimizer (Kingma & Ba, 2014)" and "cosine learning scheduler (Loshchilov & Hutter, 2017)". |
| Experiment Setup | Yes | We train ResNet18 (He et al., 2015) on CIFAR10/100-LT datasets for 200 epochs with a batch size of 128, and ResNet50 (He et al., 2015) on ImageNet-LT for 100 epochs with a batch size of 256. We optimize the model parameters with the Adam optimizer (Kingma & Ba, 2014) with an initial learning rate of 0.001, and we decay the learning rate with the cosine learning scheduler (Loshchilov & Hutter, 2017). The weight decay parameter is set to 0.0005. We set the balancing hyperparameter λ = 0.5 for all the experiments in Section 6. |
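The Dataset Splits row states that CIFAR10/100-LT is built by downsampling so that the imbalance ratio maxc N(c)/minc N(c) equals 100, following Cui et al. (2019). A minimal sketch of that construction, assuming the exponential per-class profile used by Cui et al. (the exact profile shape is an assumption; only the imbalance ratio of 100 and CIFAR10's 5,000 samples per class are given):

```python
# Sketch of long-tailed class counts for CIFAR10-LT, assuming the exponential
# profile n_c = n_max * r^(-c/(C-1)) of Cui et al. (2019), where r is the
# imbalance ratio max_c N(c) / min_c N(c) = 100 stated in the paper.
def long_tail_counts(n_max=5000, num_classes=10, imbalance_ratio=100):
    counts = []
    for c in range(num_classes):
        frac = c / (num_classes - 1)  # 0 for the head class, 1 for the tail class
        counts.append(int(n_max * imbalance_ratio ** (-frac)))
    return counts

counts = long_tail_counts()
# Head class keeps all 5000 CIFAR10 training samples; tail class keeps 50,
# giving the stated imbalance ratio of 100.
```

Counts decay monotonically from head to tail, so intermediate classes interpolate between the two extremes on a log scale.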
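The Experiment Setup row specifies an initial learning rate of 0.001 decayed with a cosine scheduler over 200 epochs (CIFAR10/100-LT). A small sketch of that decay, assuming the standard cosine annealing formula of Loshchilov & Hutter (2017) with no warmup or restarts (those details are not stated in the excerpt):

```python
import math

# Sketch of cosine learning-rate annealing, assuming the standard formula
# lr(t) = 0.5 * lr_init * (1 + cos(pi * t / T)); lr_init = 0.001 and
# T = 200 epochs come from the paper's CIFAR10/100-LT setup.
def cosine_lr(epoch, total_epochs=200, lr_init=0.001):
    return 0.5 * lr_init * (1 + math.cos(math.pi * epoch / total_epochs))
```

The rate starts at 0.001, reaches half its initial value at the midpoint, and decays to zero at the final epoch.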