Efficient Training of Neural Fractional-Order Differential Equation via Adjoint Backpropagation
Authors: Qiyu Kang, Xuhao Li, Kai Zhao, Wenjun Cui, Yanan Zhao, Weihua Deng, Wee Peng Tay
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the efficiency of our neural FDE solvers, experiments were conducted on three tasks: biological system FDE discovery, image classification, and graph node classification. The experiments in this section are designed to achieve two main objectives: 1) To verify that our adjoint FDE training accurately computes gradients and supports backpropagation. This is demonstrated in the small-scale problem described in Section 5.1, where the estimated parameters are shown to converge to the ground-truth values following the adjoint gradients. 2) To show that our adjoint FDE training is memory-efficient for large-scale problems. Experiments in Sections 5.2 and 5.3, particularly with the large-scale Ogbn-Products dataset, support this claim. |
| Researcher Affiliation | Academia | Qiyu Kang¹, Xuhao Li², Kai Zhao³, Wenjun Cui⁴, Yanan Zhao³, Weihua Deng⁵, Wee Peng Tay³ — ¹University of Science and Technology of China, ²Anhui University, ³Nanyang Technological University, ⁴Beijing Jiaotong University, ⁵Lanzhou University |
| Pseudocode | Yes | Algorithm 1: Reverse-mode Differentiation for a Neural FDE |
| Open Source Code | Yes | Code https://github.com/kangqiyu/torchfde |
| Open Datasets | Yes | This experiment evaluates the performance of neural FDEs on the MNIST dataset (LeCun et al. 1998)... We follow the experimental setup from GRAND (Chamberlain et al. 2021), conducting experiments on homophilic datasets... For the Ogbn-products dataset, we employ a mini-batch training approach as outlined in the paper (Zeng et al. 2020). |
| Dataset Splits | Yes | We follow the experimental setup from GRAND (Chamberlain et al. 2021), conducting experiments on homophilic datasets. We adopt the same dataset splitting method as in (Chamberlain et al. 2021), using the Largest Connected Component (LCC) and performing random splits. For the Ogbn-products dataset, we employ a mini-batch training approach as outlined in the paper (Zeng et al. 2020). |
| Hardware Specification | Yes | The experiments were conducted on a workstation running Ubuntu 20.04.1, equipped with an AMD Ryzen Threadripper PRO 3975WX with 32 cores and an NVIDIA RTX A5000 GPU with 24GB of memory. |
| Software Dependencies | No | The paper mentions "Ubuntu 20.04.1" as the operating system and cites "TensorFlow (Abadi et al. 2016)" and "PyTorch (Paszke et al. 2019)" as leading platforms, but does not specify the version numbers of the libraries or solvers used in their implementation for the experiments. |
| Experiment Setup | Yes | The model uses the Adam optimizer (Kingma and Ba 2014) with a learning rate of 0.01. After 30 epochs, the estimated parameters, [0.99, 0.48, 1.05, 0.33], closely match the true values, demonstrating the efficiency of our adjoint backpropagation. ... The step size h is set to 0.1, and the fractional order β is set to 0.5. ... In both the training and testing phases, the batch size is set to 128. ... In our experiments, the batch size for adj-F-GRAND is set to 20,000 compared to 10,000 for F-GRAND when executed on the same GPU. |
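The Experiment Setup row quotes a step size h = 0.1 and fractional order β = 0.5 but the report does not reproduce the solver itself. As a minimal illustration of what a fractional-order integrator looks like (a generic explicit Grünwald–Letnikov sketch, not the paper's `torchfde` implementation; the function names and the shifted-variable trick are assumptions for this example), the scheme below solves a scalar Caputo FDE D^β y = f(t, y):

```python
import numpy as np

def gl_weights(beta, n):
    # Grünwald-Letnikov binomial weights w_j = (-1)^j * C(beta, j),
    # computed stably via the recurrence w_j = w_{j-1} * (j - 1 - beta) / j.
    w = np.empty(n + 1)
    w[0] = 1.0
    for j in range(1, n + 1):
        w[j] = w[j - 1] * (j - 1 - beta) / j
    return w

def solve_fde_gl(f, y0, beta, h, steps):
    # Explicit Grünwald-Letnikov scheme for the scalar Caputo FDE
    #   D^beta y(t) = f(t, y(t)),  y(0) = y0,  0 < beta <= 1.
    # Works on the shifted variable z = y - y0, so z(0) = 0 and the
    # Riemann-Liouville and Caputo derivatives of z coincide.
    w = gl_weights(beta, steps)
    z = np.zeros(steps + 1)
    for n in range(1, steps + 1):
        # History term: sum_{j=1}^{n} w_j * z_{n-j} (the "memory" of the FDE).
        history = np.dot(w[1:n + 1], z[n - 1::-1])
        z[n] = h**beta * f((n - 1) * h, z[n - 1] + y0) - history
    return z + y0
```

A sanity check on the design: for β = 1 the weights collapse to w₀ = 1, w₁ = −1, w_j = 0 for j ≥ 2, so the scheme reduces exactly to forward Euler, while for 0 < β < 1 every step pays the full history cost, which is precisely the memory burden the paper's adjoint training is designed to mitigate.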
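The Pseudocode row records that the paper provides Algorithm 1, "Reverse-mode Differentiation for a Neural FDE". To make the underlying idea concrete, here is a toy adjoint-method gradient for the integer-order special case (β = 1, dy/dt = θy, loss L = y(T)); this is a hand-rolled sketch for illustration, not the paper's fractional-order algorithm, and `adjoint_grad` is a hypothetical helper name:

```python
import numpy as np

def adjoint_grad(theta, y0=1.0, T=1.0, N=1000):
    # Adjoint-method gradient of L = y(T) for dy/dt = theta * y, y(0) = y0.
    h = T / N
    # Forward pass (Euler), storing the trajectory.
    y = np.empty(N + 1)
    y[0] = y0
    for n in range(N):
        y[n + 1] = y[n] + h * theta * y[n]
    # Backward pass: the adjoint a(t) = dL/dy(t) satisfies
    #   da/dt = -a * df/dy = -a * theta,  with a(T) = dL/dy(T) = 1,
    # and the parameter gradient accumulates
    #   dL/dtheta = integral of a * df/dtheta = integral of a * y over [0, T].
    a = 1.0
    grad = 0.0
    for n in range(N, 0, -1):
        grad += h * a * y[n]
        a += h * a * theta  # Euler step on the adjoint ODE, reversed in time
    return y[-1], grad
```

Analytically y(T) = y0·e^{θT} and dL/dθ = T·y0·e^{θT}, so the sketch can be checked against closed forms. The fractional-order case in the paper replaces both integrals with fractional ones, which is what makes a memory-efficient adjoint nontrivial.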