Neighborhood-Aware Negative Sampling for Student Knowledge and Behavior Modeling
Authors: Siqian Zhao, Sherry Sahebi
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments on two real-world datasets indicate the significant improvement of NANS-KoBeM over the baseline methods in both student performance prediction and next question prediction tasks, the effectiveness of NANS compared to alternative negative sampling methods, in both efficiency and effectiveness, and the importance of modeling interrelations between student knowledge and behavior modeling. |
| Researcher Affiliation | Academia | Siqian Zhao, Sherry Sahebi, Department of Computer Science, University at Albany, SUNY. EMAIL, EMAIL |
| Pseudocode | No | The paper describes the model architecture and equations in detail, but it does not include a clearly labeled pseudocode block or algorithm. |
| Open Source Code | Yes | Our codes are available at https://github.com/persai-lab/2025-NANSKoBeM. |
| Open Datasets | Yes | EdNet (Choi et al. 2020b) is a publicly available anonymized dataset sourced from a multiplatform AI tutoring service, Santa, designed to help Korean students prepare for the TOEIC English test. We use a preprocessed version of the dataset from previous studies (Zhao, Wang, and Sahebi 2022; Zhao and Sahebi 2023). Junyi (CMU DataShop 2015; Pojen, Mingen, and Tzuyang 2020) is another publicly available and anonymized dataset from the Chinese e-learning platform Junyi Academy, which is designed to teach children math. We use the preprocessed data introduced in (Chang, Hsu, and Chen 2015). |
| Dataset Splits | Yes | In accordance with established evaluation protocols for sequential methods, as detailed in previous studies (Piech et al. 2015; Wang, Zhao, and Sahebi 2021; Zhao, Wang, and Sahebi 2022), we implement a 5-fold student-stratified cross-validation to divide the data, report the mean experiment results across the five folds for each method, and perform a paired t-test comparing each baseline to NANS-KoBeM. For each fold, sequences from 80% of the students are designated as the training set, and sequences from the remaining 20% of the students are used as the testing set. Additionally, 20% of the training set is reserved for hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory amounts, or detailed computer specifications used for running its experiments. It only mentions using PyTorch. |
| Software Dependencies | No | We use PyTorch to develop the NANS-KoBeM. |
| Experiment Setup | Yes | A coarse-grained grid search is performed across all methods, including baselines, to identify the optimal hyperparameters. The best hyperparameters of NANS-KoBeM are reported in Table 2, with columns (dc, dr, dc, vc, df, nc, db, db, vb, ds, nb, Ls, N): EdNet = (64, 32, 32, 32, 32, 32, 32, 32, 32, 32, 16, 50, 10); Junyi = (32, 32, 32, 64, 64, 32, 32, 32, 32, 16, 16, 100, 10). |
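The student-stratified 5-fold protocol quoted above (hold out 20% of students per fold, reserve 20% of the training students for tuning) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name `student_five_fold` and all parameters are hypothetical, and the paper's actual preprocessing lives in its repository.

```python
# Illustrative sketch of 5-fold student-level splitting (not the paper's code).
# Splitting is done over student IDs, so no student's sequence appears in
# both the training and testing sets of the same fold.
import random

def student_five_fold(student_ids, n_folds=5, tune_frac=0.2, seed=0):
    """Yield (train, tune, test) student-ID lists for each fold."""
    ids = list(student_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    # Partition the shuffled students into n_folds disjoint groups.
    folds = [ids[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]  # ~20% of students held out for testing
        train_pool = [s for i, f in enumerate(folds) if i != k for s in f]
        # Reserve 20% of the training students for hyperparameter tuning.
        n_tune = int(len(train_pool) * tune_frac)
        tune, train = train_pool[:n_tune], train_pool[n_tune:]
        yield train, tune, test

# With 100 students: each fold tests on 20, tunes on 16, trains on 64.
splits = list(student_five_fold(range(100)))
```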