Advancing Personalized Learning with Neural Collapse for Long-Tail Challenge
Authors: Hanglei Hu, Yingying Guo, Zhikang Chen, Sen Cui, Fei Wu, Kun Kuang, Min Zhang, Bo Jiang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that NCAL effectively enhances existing works, achieving new state-of-the-art performance. Additionally, NCAL mitigates class imbalance, significantly improving the model's generalization ability. Code is available at https://github.com/llm4edu/NCAL_ICML2025.git. |
| Researcher Affiliation | Academia | 1East China Normal University 2Tsinghua University 3Zhejiang University. Correspondence to: Min Zhang <EMAIL>, Bo Jiang <EMAIL>. |
| Pseudocode | No | The paper describes methods and equations but does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/llm4edu/NCAL_ICML2025.git. |
| Open Datasets | No | Experiments are demonstrated on two personalized learning datasets. The TMWPL dataset is designed to evaluate students' cognitive abilities, while the PMTD dataset focuses on classifying teacher-student dialogue behaviors. Examples from each dataset are shown in Figure 3. As illustrated in Figure 4, both datasets exhibit a characteristic long-tail distribution in sample counts across categories. ... The TMWPL dataset is developed based on the TIMSS (Trends in International Mathematics and Science Study) assessment framework... The PMTD dataset captures one-on-one instructional interactions between teachers and students... |
| Dataset Splits | Yes | TMWPL. ...The test set includes 50 samples per cognitive dimension, with the remaining data used for training. ...PMTD. ...The test set includes 40 samples for each category, and the remaining data is used for training. |
| Hardware Specification | Yes | All experiments were conducted on a system with eight NVIDIA A100 GPUs (80GB each). |
| Software Dependencies | No | Our model training implements LoRA-based PEFT, using the LLaMA Factory (Zheng et al., 2024) automated training and inference pipeline as a foundation. The paper mentions software (LLaMA Factory) and its foundational methods (LoRA-based PEFT) but does not provide specific version numbers for these software components. |
| Experiment Setup | No | The combined loss function integrates the TC loss with the standard task-specific loss: L = L_Task + λ·L_TC, where λ controls the influence of the TC regularization. ...For implementation, we utilized LoRA to efficiently adapt the LLM weights W0 using trainable rank-decomposition matrices B ∈ R^{d×r} and A ∈ R^{r×k}. ...The matrix A is initialized from a Gaussian distribution, and the matrix B is initialized as a zero matrix. While some aspects such as the loss-weighting parameter (λ) and initialization are mentioned, specific hyperparameter values like learning rate, batch size, number of epochs, and optimizer settings are not explicitly detailed in the main text. |
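The setup quoted above can be sketched in a few lines. The pure-Python sketch below (function names hypothetical, not from the paper's released code) shows the two mechanics the excerpt describes: LoRA's rank-decomposition W = W0 + B·A with B initialized to zeros and A to a Gaussian, and the combined loss L = L_Task + λ·L_TC. Because B starts at zero, the adapted weight equals the pretrained weight at initialization, so fine-tuning starts from the base model's behavior.

```python
import random

def matmul(M, N):
    # Naive matrix product of nested-list matrices.
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def lora_init(d, k, r, std=0.01, seed=0):
    """LoRA factors for a d x k weight: B (d x r) zeros, A (r x k) Gaussian."""
    rng = random.Random(seed)
    B = [[0.0] * r for _ in range(d)]
    A = [[rng.gauss(0.0, std) for _ in range(k)] for _ in range(r)]
    return B, A

def adapted_weight(W0, B, A):
    # W = W0 + B @ A  (the frozen base weight plus the low-rank update)
    BA = matmul(B, A)
    return [[W0[i][j] + BA[i][j] for j in range(len(W0[0]))]
            for i in range(len(W0))]

def combined_loss(l_task, l_tc, lam):
    # L = L_Task + lambda * L_TC, with lambda weighting the TC regularizer.
    return l_task + lam * l_tc
```

Since B is all zeros at initialization, `adapted_weight(W0, B, A)` reproduces `W0` exactly before any training step; the TC term only shifts the objective through the λ-weighted sum.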