Curriculum Abductive Learning for Mitigating Reasoning Shortcuts

Authors: Wen-Da Wei, Xiao-Wen Yang, Jie-Jing Shao, Lan-Zhe Guo

IJCAI 2025

Reproducibility Assessment (variable, result, and supporting response):
Research Type: Experimental. In this section, we conduct experiments to verify our claims and validate the superior performance of Cur ABL. In Subsection 5.1, we verify the clustering capability of the perception model of ABL after training on the MNIST-Addition dataset. In Subsection 5.2, we describe the experimental setup. In Subsection 5.3, we evaluate the effectiveness of Cur ABL on two datasets, MNIST-Addition and Handwritten Formula Recognition, by comparing it with different ABL methods.
Researcher Affiliation: Academia. (1) National Key Laboratory for Novel Software Technology, Nanjing University, China; (2) School of Artificial Intelligence, Nanjing University, China; (3) School of Intelligence Science and Technology, Nanjing University, China.
Pseudocode: Yes. The paper presents three algorithms:

Algorithm 1 (Cold Start Graph Construction). Input: training set S = {(x_i, y_i)}, i = 1..N; encoder E; threshold τ. Parameter: number of intermediate concepts k. Output: k undirected graphs {G_1, G_2, . . . , G_k}.

Algorithm 2 (Incorrect Pseudo-Label Removal). Input: k undirected graphs {G_1, G_2, . . . , G_k}; initial pseudo-label candidate sets {C_1, C_2, . . . , C_N} for N samples. Parameter: number of intermediate concepts k. Output: updated pseudo-label candidate sets {C_1, C_2, . . . , C_N}.

Algorithm 3 (Cur ABL Training Workflow). Input: training set S = {(x_i, y_i)}, i = 1..N; updated pseudo-label candidate sets {C_1, C_2, . . . , C_N}; batch size B; number of epochs E. Parameter: model f; learning rate η. Output: trained ABL model f.
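The paper's algorithm listings give only the interfaces (inputs, parameters, outputs), not the bodies. A minimal sketch of what Algorithm 1's cold-start graph construction might look like, assuming each of the k graphs groups samples sharing the same pseudo-label and connects pairs whose encoder embeddings exceed the cosine-similarity threshold τ (the function name, the similarity measure, and the per-concept grouping are all assumptions, not details from the paper):

```python
import itertools
import numpy as np

def cold_start_graphs(embeddings, pseudo_labels, k, tau=0.95):
    """Hypothetical sketch of Algorithm 1: build one undirected graph per
    intermediate concept, connecting sample pairs whose encoder embeddings
    have cosine similarity of at least tau."""
    graphs = {c: [] for c in range(k)}  # one edge list per concept
    # Normalize rows so a dot product equals cosine similarity.
    norms = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    for c in range(k):
        idx = [i for i, y in enumerate(pseudo_labels) if y == c]
        for i, j in itertools.combinations(idx, 2):
            if float(norms[i] @ norms[j]) >= tau:
                graphs[c].append((i, j))
    return graphs
```

The threshold τ = 0.95 matches the value quoted in the Experiment Setup row below; everything else is illustrative.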
Open Source Code: No. The paper does not contain any explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets: Yes. The MNIST-Addition task [Manhaeve et al., 2018] takes two images of handwritten digits as input and outputs their sum. Additionally, the paper conducts experiments on the Handwritten Formula Recognition (HWF) task [Li et al., 2020].
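For concreteness, the MNIST-Addition setup described above can be sketched as a small data-construction helper: each training sample is a pair of digit images whose only supervision is the sum of the two digits, with the individual digit labels hidden from the learner. The function name and random pairing below are illustrative assumptions, not details from the paper:

```python
import random

def make_mnist_addition_pairs(images, digit_labels, n_pairs, seed=0):
    """Sketch of MNIST-Addition: pair up digit images at random and label
    each pair only with the sum of the two underlying digits (0..18)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        i = rng.randrange(len(images))
        j = rng.randrange(len(images))
        # Supervision is the sum; digit_labels[i] and digit_labels[j]
        # themselves are never exposed to the perception model.
        pairs.append(((images[i], images[j]), digit_labels[i] + digit_labels[j]))
    return pairs
```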
Dataset Splits: No. The paper reports total sample counts for the MNIST-Addition (30,000) and HWF (10,000) datasets, and describes how HWF-M and HWF-H were created by filtering on complexity (e.g., removing samples where the cardinality of the candidate set of all pseudo-labels was less than 10). However, it does not provide train/validation/test split percentages, per-split sample counts, or explicit references to standard splits needed to reproduce the data partitioning.
Hardware Specification: Yes. All experiments are implemented in PyTorch and are conducted on an NVIDIA RTX 3090 GPU.
Software Dependencies: No. The paper states that all experiments are implemented in PyTorch and conducted on an NVIDIA RTX 3090 GPU, but it does not provide a specific PyTorch version number, nor does it list any other software dependencies with version numbers.
Experiment Setup: Yes. In the cold start method, we first train the ABL model on the dataset for six epochs, and we set the threshold τ = 0.95. To ensure reliability of our results, all experiments are repeated five times with different random seeds.