Geometry of Long-Tailed Representation Learning: Rebalancing Features for Skewed Distributions
Authors: Lingjie Yi, Jiachen Yao, Weimin Lyu, Haibin Ling, Raphael Douady, Chao Chen
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the effectiveness of our method through extensive experiments on the CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist 2018 datasets. ... We validate our method with experiments on commonly used datasets. FeatRecon outperforms widely adopted long-tailed learning baselines, achieving SOTA performance. |
| Researcher Affiliation | Academia | Lingjie Yi1, Jiachen Yao1, Weimin Lyu1, Haibin Ling1, Raphael Douady2, Chao Chen1 (1 Stony Brook University, 2 University Paris 1 Pantheon-Sorbonne) |
| Pseudocode | Yes | A.1 PSEUDO ALGORITHMS A.1.1 TRAINING PROCESS In this section, we first present the training procedure of FeatRecon. ... Algorithm 1: FeatRecon Algorithm ... A.1.2 SYNTHETIC FEATURE GENERATION ... Algorithm 2: Synthetic Feature Generation Algorithm |
| Open Source Code | No | The paper does not contain any explicit statement about releasing the source code, nor does it provide a link to a code repository or mention code in supplementary materials. |
| Open Datasets | Yes | We validate the effectiveness of our method through extensive experiments on the CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist 2018 datasets. ... CIFAR-10-LT and CIFAR-100-LT are the imbalanced subsets of the original CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009)... ImageNet-LT (Liu et al., 2019) is the subset of the original ImageNet (Deng et al., 2009)... iNaturalist 2018 (Van Horn et al., 2018) is a large-scale long-tailed dataset |
| Dataset Splits | Yes | CIFAR-10-LT and CIFAR-100-LT are the imbalanced subsets of the original CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009), following (Kang et al., 2021; Li et al., 2022; Zhu et al., 2022). ... ImageNet-LT (Liu et al., 2019) is the subset of the original ImageNet (Deng et al., 2009)... We divide the testing sets into three subsets: many (more than 100 instances), medium (20 to 100 instances), and few (fewer than 20 instances) splits. |
| Hardware Specification | No | The paper does not specify any particular GPU models, CPU types, or other hardware components used for running the experiments. It only mentions using ResNet-32, ResNet-50, and ResNeXt50 as backbones, which are model architectures, not hardware. |
| Software Dependencies | No | The paper mentions using specific model backbones like ResNet-32, ResNet-50, and ResNeXt50-32x4d, along with optimizers (SGD) and data augmentation techniques (AutoAug, Cutout, SimAug). However, it does not provide specific version numbers for any of these software components, frameworks (like PyTorch or TensorFlow), or other libraries. |
| Experiment Setup | Yes | For both CIFAR-10-LT and CIFAR-100-LT, we adopt the ResNet-32 as the backbone. ... Our model is trained for 200 epochs with a batch size of 256 and with an SGD optimizer. The momentum is 0.9 and the weight decay is 5e-4. The learning rate warms up to 0.15 in the first 5 epochs and decays by a factor of 0.1 at the 160th and 180th epochs. ... For hyperparameters, we set λx = 2.0, λc = 0.6, α = 0.99, and τ = 0.1; τ+ is scheduled by training epoch between 0 and 1 using the method in Kukleva et al. (2023). ... Our model is trained for 90 epochs for ImageNet-LT and 100 epochs for iNaturalist 2018 with a batch size of 256 and with an SGD optimizer. The momentum is 0.9 and the weight decay is 5e-4 for ImageNet-LT and 1e-4 for iNaturalist 2018. The learning rate is 0.1 for ImageNet-LT and 0.2 for iNaturalist 2018 with a cosine scheduler. ... For hyperparameters, we set λx = 1, λc = 0.35, α = 0.99, and τ = 0.07; τ+ is scheduled by training epoch between 0.07 and 1 using the method in Kukleva et al. (2023). |
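The many/medium/few evaluation protocol quoted in the Dataset Splits row can be sketched as a small helper. The function name and output layout are illustrative (not from the paper), but the thresholds follow the stated splits: many (more than 100 instances), medium (20 to 100), few (fewer than 20).

```python
def split_classes(class_counts):
    """Partition class indices into many/medium/few splits by training-instance count.

    class_counts: list where class_counts[i] is the number of training
    instances for class i. Thresholds follow the paper's protocol.
    """
    splits = {"many": [], "medium": [], "few": []}
    for cls, n in enumerate(class_counts):
        if n > 100:          # many-shot: more than 100 instances
            splits["many"].append(cls)
        elif n >= 20:        # medium-shot: 20 to 100 instances
            splits["medium"].append(cls)
        else:                # few-shot: fewer than 20 instances
            splits["few"].append(cls)
    return splits
```

Per-split accuracy is then computed by averaging over the classes in each bucket, which is how long-tailed benchmarks typically report many/medium/few results.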
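The CIFAR-LT learning-rate schedule described in the Experiment Setup row (warmup to 0.15 over the first 5 epochs, then ×0.1 decay at epochs 160 and 180) can be sketched as a per-epoch function. The linear warmup shape is an assumption; the base rate, warmup length, and milestones come from the reported setup.

```python
def cifar_lt_lr(epoch, base_lr=0.15, warmup_epochs=5,
                milestones=(160, 180), gamma=0.1):
    """Per-epoch learning rate for the reported CIFAR-LT schedule.

    Linear warmup to base_lr over the first warmup_epochs epochs
    (warmup shape assumed), then multiply by gamma at each milestone.
    """
    if epoch < warmup_epochs:
        # Assumed linear ramp: epoch 0 -> base_lr/warmup_epochs, epoch 4 -> base_lr
        return base_lr * (epoch + 1) / warmup_epochs
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma       # step decay at the 160th and 180th epochs
    return lr
```

In a PyTorch training loop this would typically be realized with a warmup wrapper around `MultiStepLR`; the ImageNet-LT/iNaturalist runs instead use a cosine scheduler, as the row notes.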