A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models
Authors: Mengyang Sun, Yihao Wang, Tao Feng, Dan Zhang, Yifan Zhu, Jie Tang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a series of comparative experiments to evaluate the performances of MoE-LoRA across various downstream tasks (Zhang et al., 2024b; Luo et al., 2025b) including Question Answering, the GLUE Benchmark, and the Vision-Language task. ... We implement and examine our rescaling approach for MoE-LoRA under a series of foundation models, illustrating our effectiveness across various tasks. ... Finally, to lend support to our theoretical foundation, we conduct an ablation study by assessing our forwarding revisions only under a classic optimizer without Riemannian preconditioners support. |
| Researcher Affiliation | Academia | 1Department of Computer Science and Technology, Tsinghua University, Beijing, China; 2Computer School, Beijing Information Science and Technology University, Beijing, China; 3School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China. |
| Pseudocode | Yes | Algorithm 1: Engineering Alternative Solution of Gate-based Rescaling Method. `def forward(self, x, ...): ... # compute gate values` `gvs = ...` `# execute each activated expert` `for exp_id in activated_experts:` `A = self.As[exp_id]` `B = self.Bs[exp_id]` `gv = gvs[:, :, exp_id]` `exp_out = B(A(x))` `sqrt_gv = (gv**0.5).detach()  # update 1` `w_exp_out = sqrt_gv*exp_out + (gv - sqrt_gv)*exp_out.detach()  # update 2` `result = result + w_exp_out` ... |
| Open Source Code | Yes | Source code is available at https://github.com/THUDM/MoELoRA_Riemannian. |
| Open Datasets | Yes | We evaluate our proposed method on several question-answering benchmarks, including ScienceQA (Lu et al., 2022), CommonsenseQA (Talmor et al., 2019), OpenBookQA (Mihaylov et al., 2018) and SIQA (Sap et al., 2019). ... GLUE (Wang et al., 2019) ... For evaluation, Visual7W (Zhu et al., 2016) and VMCBench (Zhang et al., 2025b) datasets are employed. |
| Dataset Splits | Yes | For VMCBench, we only use their dev set since their test set is not labeled. We take 900 of all the 1,000 labeled samples as training samples, while the remaining 100 are held out for evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | For most experiments, unless otherwise specified, we construct a mixture of LoRA modules with a total of 20 experts, a rank of 4 for each expert, and a selection of top-10 experts activated each time. ... During training, we follow a linear decay learning-rate scheduler. We assign a relatively smaller learning rate to the gate module compared to other trainable components, to achieve a stable training behavior. ... Table 12. Default experimental details implemented throughout this paper. All experiments follow this configuration unless they specify their particular settings... |
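The gate-based rescaling in the quoted Algorithm 1 can be seen more clearly outside the table. A minimal NumPy sketch is given below; it is not the authors' implementation (which is PyTorch, per the repository above), and the dimensions, variable names, and `moe_lora_forward` helper are hypothetical. The key identity it illustrates: the two terms `sqrt(gv)*out + (gv - sqrt(gv))*out.detach()` sum to `gv*out` in the forward pass, so the rescaling changes only the gradient path, not the output value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions for illustration only (the paper's defaults are
# 20 experts of rank 4 with top-10 routing; we shrink all of that here):
d, r, E = 8, 4, 3                  # model dim, LoRA rank, number of experts
As = [rng.standard_normal((r, d)) * 0.1 for _ in range(E)]
Bs = [rng.standard_normal((d, r)) * 0.1 for _ in range(E)]
x = rng.standard_normal((5, d))    # a batch of 5 token embeddings

def moe_lora_forward(x, gvs, activated):
    """Sum of gate-weighted LoRA expert outputs: result += gv * B(A(x)).

    Algorithm 1 instead emits sqrt(gv)*out + (gv - sqrt(gv))*out.detach().
    Those two branches add back up to gv*out, so this function reproduces
    the forward value; the detach trick matters only for backpropagation,
    where it makes the expert's gradient scale with sqrt(gv) instead of gv.
    """
    result = np.zeros_like(x)
    for e in activated:
        exp_out = x @ As[e].T @ Bs[e].T   # B(A(x)) for expert e
        gv = gvs[:, e:e + 1]              # per-token gate value for expert e
        result += gv * exp_out
    return result
```

As a sanity check, for any gate value `gv` and expert output `out`, `np.sqrt(gv)*out + (gv - np.sqrt(gv))*out` equals `gv*out` exactly, confirming that the rescaled forward pass is numerically unchanged.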