Contrastive Cross-Course Knowledge Tracing via Concept Graph Guided Knowledge Transfer
Authors: Wenkang Han, Wang Lin, Liya Hu, Zhenlong Dai, Yiyun Zhou, Mengze Li, Zemin Liu, Chang Yao, Jingyuan Chen
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments. The paper reports extensive experiments on three cross-course knowledge tracing datasets, demonstrating the superiority of TransKT over state-of-the-art baselines, along with comprehensive analyses of the proposed model's effectiveness (Table 2: performance comparison of TransKT and 12 KT models on three datasets; Figure 5: ablation study). |
| Researcher Affiliation | Academia | Wenkang Han, Wang Lin, Liya Hu, Zhenlong Dai, Yiyun Zhou, Mengze Li, Zemin Liu, Chang Yao, Jingyuan Chen* (Zhejiang University) |
| Pseudocode | No | The paper describes the methodology in detail using text and mathematical formulations (e.g., equations 1-13) but does not include any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | Our code and datasets are available at https://github.com/DQYZHWK/TransKT/. |
| Open Datasets | Yes | We further process the publicly available PTADisc dataset [Hu et al., 2023] to obtain three sub-datasets specifically tailored to support the analysis of the CCKT task. |
| Dataset Splits | No | The paper mentions generating three sub-datasets from PTADisc and provides statistics in Table 1. It discusses variations in learning history length for specific experiments, but it does not specify explicit train/validation/test splits (e.g., percentages or sample counts) for reproducibility. |
| Hardware Specification | No | The paper describes experimental settings like optimizer, embedding size, dropout rate, learning rate, and L2 coefficient, but does not specify any hardware details such as GPU models, CPU types, or memory used for the experiments. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer and fine-tuning a smaller language model (LM) such as RoBERTa, based on the transformer architecture. However, it does not provide specific version numbers for any programming languages, libraries, or other software components used in the implementation. |
| Experiment Setup | Yes | We use the AdamW optimizer to train all models, fixing the embedding size at 256 and the dropout rate at 0.3. The learning rate and L2 coefficient are chosen from the sets {1e-3, 1e-4, 1e-5} and {1e-4, 5e-5, 1e-5}, respectively. The hyperparameters η and λ are chosen from the range 0.1 to 0.9 with a step size of 0.1. We set an epoch limit of 200 and employ an early stopping strategy if the AUC shows no improvement for 10 consecutive epochs. |
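The experiment setup quoted above (hyperparameter search sets, a 200-epoch limit, and AUC-based early stopping with a patience of 10) can be sketched in plain Python. This is a minimal illustration, not the TransKT implementation; the names `hyperparameter_grid` and `EarlyStopping` are our own and do not appear in the paper or its repository.

```python
import itertools

# Search sets as stated in the paper's experiment setup.
LEARNING_RATES = [1e-3, 1e-4, 1e-5]
L2_COEFFS = [1e-4, 5e-5, 1e-5]
ETA_LAMBDA = [round(0.1 * i, 1) for i in range(1, 10)]  # 0.1 .. 0.9, step 0.1


def hyperparameter_grid():
    """Yield every (lr, l2, eta, lam) combination from the stated search sets."""
    return itertools.product(LEARNING_RATES, L2_COEFFS, ETA_LAMBDA, ETA_LAMBDA)


class EarlyStopping:
    """Stop when validation AUC fails to improve for `patience` consecutive
    epochs, or when the epoch limit is reached."""

    def __init__(self, patience=10, max_epochs=200):
        self.patience = patience
        self.max_epochs = max_epochs
        self.best_auc = float("-inf")
        self.bad_epochs = 0
        self.epoch = 0

    def step(self, val_auc):
        """Record one epoch's validation AUC; return True if training should stop."""
        self.epoch += 1
        if val_auc > self.best_auc:
            self.best_auc = val_auc
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience or self.epoch >= self.max_epochs
```

With these sets the grid contains 3 × 3 × 9 × 9 = 729 configurations, which is why papers typically report only the chosen values rather than the full sweep.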