Advancing Personalized Learning with Neural Collapse for Long-Tail Challenge
Authors: Hanglei Hu, Yingying Guo, Zhikang Chen, Sen Cui, Fei Wu, Kun Kuang, Min Zhang, Bo Jiang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that NCAL effectively enhances existing works, achieving new state-of-the-art performance. Additionally, NCAL mitigates class imbalance, significantly improving the model's generalization ability. Code is available at https://github.com/llm4edu/NCAL_ICML2025.git. |
| Researcher Affiliation | Academia | 1East China Normal University 2Tsinghua University 3Zhejiang University. Correspondence to: Min Zhang <EMAIL>, Bo Jiang <EMAIL>. |
| Pseudocode | No | The paper describes methods and equations but does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/llm4edu/NCAL_ICML2025.git. |
| Open Datasets | No | Experiments are demonstrated on two personalized learning datasets. The TMWPL dataset is designed to evaluate students' cognitive abilities, while the PMTD dataset focuses on classifying teacher-student dialogue behaviors. Examples from each dataset are shown in Figure 3. As illustrated in Figure 4, both datasets exhibit a characteristic long-tail distribution in sample counts across categories. ... The TMWPL dataset is developed based on the TIMSS (Trends in International Mathematics and Science Study) assessment framework... The PMTD dataset captures one-on-one instructional interactions between teachers and students... |
| Dataset Splits | Yes | TMWPL. ...The test set includes 50 samples per cognitive dimension, with the remaining data used for training. ...PMTD. ...The test set includes 40 samples for each category, and the remaining data is used for training. |
| Hardware Specification | Yes | All experiments were conducted on a system with eight NVIDIA A100 GPUs (80GB each). |
| Software Dependencies | No | Our model training implements LoRA-based PEFT, using the LLaMA Factory (Zheng et al., 2024) automated training and inference pipeline as a foundation. The paper mentions software (LLaMA Factory) and its foundational methods (LoRA-based PEFT) but does not provide specific version numbers for these software components. |
| Experiment Setup | No | The combined loss function integrates the TC loss with the standard task-specific loss: L = L_Task + λ·L_TC, where λ controls the influence of the TC regularization. ...For implementation, we utilized LoRA to efficiently adapt the LLM weights W0 using trainable rank-decomposition matrices B ∈ R^{d×r} and A ∈ R^{r×k}. ...The matrix A is initialized from a Gaussian distribution, and the matrix B is initialized as a zero matrix. While some aspects such as the loss-weighting parameter (λ) and initialization are mentioned, specific hyperparameter values like learning rate, batch size, number of epochs, and optimizer settings are not explicitly detailed in the main text. |
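The setup quoted above can be sketched in a few lines. The pure-Python sketch below (function names hypothetical, not from the paper's released code) shows the two mechanics the excerpt describes: LoRA's rank-decomposition W = W0 + B·A with B initialized to zeros and A to a Gaussian, and the combined loss L = L_Task + λ·L_TC. Because B starts at zero, the adapted weight equals the pretrained weight at initialization, so fine-tuning starts from the base model's behavior.

```python
import random

def matmul(M, N):
    # Naive matrix product of nested-list matrices.
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def lora_init(d, k, r, std=0.01, seed=0):
    """LoRA factors for a d x k weight: B (d x r) zeros, A (r x k) Gaussian."""
    rng = random.Random(seed)
    B = [[0.0] * r for _ in range(d)]
    A = [[rng.gauss(0.0, std) for _ in range(k)] for _ in range(r)]
    return B, A

def adapted_weight(W0, B, A):
    # W = W0 + B @ A  (the frozen base weight plus the low-rank update)
    BA = matmul(B, A)
    return [[W0[i][j] + BA[i][j] for j in range(len(W0[0]))]
            for i in range(len(W0))]

def combined_loss(l_task, l_tc, lam):
    # L = L_Task + lambda * L_TC, with lambda weighting the TC regularizer.
    return l_task + lam * l_tc
```

Since B is all zeros at initialization, `adapted_weight(W0, B, A)` reproduces `W0` exactly before any training step; the TC term only shifts the objective through the λ-weighted sum.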