Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach

Authors: Xu Zhang, Kaidi Xu, Ziqing Hu, Ren Wang

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results on CIFAR-10 and Tiny ImageNet datasets using ResNet18 and Vision Transformer (ViT) architectures demonstrate the effectiveness of our proposed methods.
Researcher Affiliation Collaboration (1) Illinois Institute of Technology, (2) Drexel University, (3) Perplexity AI. Correspondence to: Ren Wang <EMAIL>.
Pseudocode Yes Algorithm 1 The JTDMoE algorithm
Open Source Code Yes The code is publicly available at https://github.com/TIML-Group/Robust-MoE-Dual-Model.
Open Datasets Yes Experimental results on CIFAR-10 and Tiny ImageNet datasets using ResNet18 and Vision Transformer (ViT) architectures demonstrate the effectiveness of our proposed methods.
Dataset Splits No The paper uses the CIFAR-10 and Tiny ImageNet datasets but does not explicitly state the training/validation/test splits (e.g., percentages or counts) used for these datasets.
Hardware Specification No We are thankful for the computational resources made available through NSF ACCESS and Argonne Leadership Computing Facility.
Software Dependencies No The paper does not explicitly state specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup Yes We train ResNet18-based MoE for 130 epochs on CIFAR-10 and fine-tune pre-trained ViT-small-based MoE for 10 epochs on Tiny ImageNet. A Cyclic Learning Rate strategy (Smith, 2017), starting at 0.0001, and data augmentation (Rebuffi et al., 2021) are used to enhance performance. The hyperparameter β controls the trade-off between MoE-wide robustness and expert-specific robustness. (Table 9 shows values for β: 1, 3, 6, 9). We use PGD (Madry et al., 2017) and AutoAttack (Croce & Hein, 2020) to assess model performance under adversarial conditions, with ϵ = 8/255 for CIFAR-10 and ϵ = 2/255 for Tiny ImageNet. ... Evaluation is done using either a 50-step PGD or AutoAttack with the same step size.
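To illustrate the evaluation protocol quoted above, here is a minimal sketch of a 50-step L-infinity PGD attack with ϵ = 8/255. This is not the paper's implementation: it uses a toy linear classifier with an analytic input gradient (so no deep-learning framework is needed), and the function names `pgd_attack` and `loss_and_input_grad` are illustrative assumptions, not identifiers from the released code.

```python
import math

def loss_and_input_grad(w, b, x, y):
    """Toy linear 'model': binary cross-entropy loss and its gradient
    with respect to the input x (analytic, so no autodiff is needed)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))           # sigmoid probability of class 1
    loss = -math.log(p if y == 1 else 1.0 - p)
    # For a linear score, dL/dx = (p - y) * w
    grad = [(p - y) * wi for wi in w]
    return loss, grad

def pgd_attack(w, b, x, y, eps=8 / 255, alpha=2 / 255, steps=50):
    """L-infinity PGD (Madry et al., 2017): repeatedly take a signed-gradient
    ascent step on the loss, then project back into the eps-ball around x."""
    x_adv = list(x)
    for _ in range(steps):
        _, g = loss_and_input_grad(w, b, x_adv, y)
        # Signed gradient step of size alpha
        x_adv = [xi + alpha * (1.0 if gi >= 0 else -1.0)
                 for xi, gi in zip(x_adv, g)]
        # Project the perturbation into [-eps, eps] around the clean input
        x_adv = [min(max(xa, xo - eps), xo + eps)
                 for xa, xo in zip(x_adv, x)]
        # Keep pixels in the valid [0, 1] range
        x_adv = [min(max(xa, 0.0), 1.0) for xa in x_adv]
    return x_adv
```

In the paper's setting the same loop would run on the MoE model's autodiff gradients, with ϵ = 8/255 for CIFAR-10 and ϵ = 2/255 for Tiny ImageNet; the projection step is what keeps the adversarial example within the stated perturbation budget.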