Curriculum Abductive Learning for Mitigating Reasoning Shortcuts
Authors: Wen-Da Wei, Xiao-Wen Yang, Jie-Jing Shao, Lan-Zhe Guo
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct experiments to verify our claims and validate the superior performance of CurABL. In Subsection 5.1, we verify the clustering capability of the perception model of ABL after training on the MNIST-Addition dataset. In Subsection 5.2, we describe the experimental setup. In Subsection 5.3, we evaluate the effectiveness of CurABL on two datasets, MNIST-Addition and Handwritten Formula Recognition, by comparing it with different ABL methods. |
| Researcher Affiliation | Academia | 1National Key Laboratory for Novel Software Technology, Nanjing University, China 2School of Artificial Intelligence, Nanjing University, China 3School of Intelligence Science and Technology, Nanjing University, China |
| Pseudocode | Yes | Algorithm 1 (Cold Start Graph Construction). Input: training set S = {(x_i, y_i)}_{i=1}^N, encoder E, threshold τ; Parameter: number of intermediate concepts k; Output: k undirected graphs {G_1, G_2, ..., G_k}. Algorithm 2 (Incorrect Pseudo-Label Removal). Input: k undirected graphs {G_1, G_2, ..., G_k}, initial pseudo-label candidate sets {C_1, C_2, ..., C_N} for N samples; Parameter: number of intermediate concepts k; Output: updated pseudo-label candidate sets {C_1, C_2, ..., C_N}. Algorithm 3 (CurABL Training Workflow). Input: training set S = {(x_i, y_i)}_{i=1}^N, updated pseudo-label candidate sets {C_1, C_2, ..., C_N}, batch size B, number of epochs E; Parameter: model f, learning rate η; Output: trained ABL model f. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | Settings of MNIST-Addition: The MNIST-Addition task [Manhaeve et al., 2018] takes two images of handwritten digits as input and outputs their sum. Settings of Handwritten Formula Recognition: Additionally, we conduct experiments on the Handwritten Formula Recognition (HWF) task [Li et al., 2020]. |
| Dataset Splits | No | The paper mentions the total number of samples for MNIST-Addition (30000) and HWF (10000) datasets, and describes how HWF-M and HWF-H were created by filtering based on complexity (e.g., 'removed samples where the cardinality of the candidate set of all pseudo-labels was less than 10'). However, it does not provide specific training/validation/test split percentages, sample counts, or explicit references to standard splits for the primary datasets to reproduce the data partitioning. |
| Hardware Specification | Yes | All experiments are implemented in PyTorch and are conducted on an NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | All experiments are implemented in PyTorch and are conducted on an NVIDIA RTX 3090 GPU. The paper mentions PyTorch but does not provide a specific version number, nor does it list any other software dependencies with version numbers. |
| Experiment Setup | Yes | In the cold start method, we first train the ABL model on the dataset for six epochs and we set the threshold τ = 0.95. To ensure reliability of our results, all experiments are repeated five times with different random seeds. |
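The cold-start step quoted above (Algorithm 1) builds undirected graphs over training samples using an encoder and a similarity threshold τ = 0.95. The paper's full algorithm body is not reproduced in this table, so the following is only a minimal sketch under one plausible reading: encode each sample, then connect pairs whose embedding cosine similarity exceeds τ. The function name `build_cold_start_graph` and the cosine-similarity criterion are assumptions, not quoted from the paper.

```python
import numpy as np

def build_cold_start_graph(embeddings, tau=0.95):
    """Hypothetical sketch of Algorithm 1 (Cold Start Graph Construction).

    Connects sample pairs (i, j) whose embedding cosine similarity
    exceeds tau. The actual edge criterion and encoder E are not
    quoted in the table above, so this is an illustrative reading only.
    """
    # Normalize rows so the dot product equals cosine similarity.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T
    n = len(z)
    # Undirected edges: one entry per unordered pair above threshold.
    return {(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if sim[i, j] > tau}
```

In a full pipeline this would be run once per intermediate concept position (k graphs in total), with `embeddings` produced by the perception model after the six warm-up epochs mentioned in the setup row.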
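Algorithm 2 (Incorrect Pseudo-Label Removal) updates per-sample pseudo-label candidate sets using the graphs from Algorithm 1. Its body is likewise not quoted, so this sketch encodes one natural interpretation: samples joined by an edge should share an intermediate concept, so candidates absent from a neighbour's set are pruned, iterated to a fixed point. The function name `prune_candidates` and this intersection rule are assumptions for illustration.

```python
def prune_candidates(edges, candidates):
    """Hypothetical sketch of Algorithm 2 (Incorrect Pseudo-Label Removal).

    edges: set of undirected edges (i, j) from the cold-start graph.
    candidates: list of pseudo-label candidate sets C_1..C_N.
    Linked samples are assumed to share a concept label, so each pair's
    candidate sets are intersected until no further change occurs.
    """
    pruned = [set(c) for c in candidates]
    changed = True
    while changed:
        changed = False
        for i, j in edges:
            inter = pruned[i] & pruned[j]
            # Only shrink non-trivially: keep sets untouched if the
            # intersection is empty (conflicting evidence) or unchanged.
            if inter and (inter != pruned[i] or inter != pruned[j]):
                pruned[i], pruned[j] = set(inter), set(inter)
                changed = True
    return pruned
```

For example, two linked samples with candidate sets {1, 2, 3} and {2, 3, 4} would both be reduced to {2, 3}, while an unlinked sample keeps its original set, which matches the table's description of removing "incorrect" pseudo-labels before the curriculum training of Algorithm 3.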