ADFormer: Aggregation Differential Transformer for Passenger Demand Forecasting

Authors: Haichen Wang, Liu Yang, Xinyuan Zhang, Haomin Yu, Ming Li, Jilin Hu

IJCAI 2025

Reproducibility Assessment (Variable · Result · LLM Response)
Research Type: Experimental. "Experiments conducted on taxi and bike datasets confirm the effectiveness and efficiency of our model, demonstrating its practical value." "Extensive experiments conducted on three real-world datasets from two cities demonstrate that our proposed model, ADFormer, surpasses state-of-the-art baselines in forecasting accuracy while maintaining computational efficiency, highlighting its practical applicability."
Researcher Affiliation: Collaboration. Haichen Wang (1), Liu Yang (1), Xinyuan Zhang (1), Haomin Yu (2), Ming Li (3), Jilin Hu (1,4). (1) East China Normal University; (2) Aalborg University; (3) INSPUR Co., Ltd; (4) KLATASDS-MOE.
Pseudocode: No. The paper describes its methodology using mathematical formulations and textual explanations within sections such as "3 Methodology", "3.1 Data Embedding", "3.2 Unified Spatial Attention", and "3.3 Systemic Temporal Attention", but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code: Yes. "The code is available at https://github.com/decisionintelligence/ADFormer."
Open Datasets: Yes. "We evaluate our model on three widely used public datasets: NYC-Taxi (1), NYC-Bike (2), and Xi'an-Taxi, which exhibit diverse urban structures and demand distributions. NYC-Taxi/Bike: these two datasets record transportation activity across 263 boroughs (i.e., urban regions) in New York. ..." (1) https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page (2) https://citibikenyc.com/system-data
Dataset Splits: Yes. "For each dataset, we split the data into training, validation, and test sets in a 7:1:2 ratio."
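The 7:1:2 split quoted above can be sketched as follows. The paper does not state whether the split is chronological; for time-series demand data a chronological split is the usual convention, so that is assumed here, and `split_dataset` is an illustrative helper name, not from the paper's code.

```python
# Minimal sketch of a 7:1:2 train/validation/test split over an ordered
# sequence of samples (chronological ordering is an assumption).
def split_dataset(samples, ratios=(0.7, 0.1, 0.2)):
    """Split an ordered sequence into contiguous train/val/test portions."""
    n = len(samples)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(list(range(100)))
print(len(train), len(val), len(test))  # 70 10 20
```

Because the portions are contiguous, the test set always follows the training and validation data in time, avoiding leakage from future observations.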
Hardware Specification: Yes. "We conduct experiments on an NVIDIA GeForce RTX 3090 with 24GB of memory."
Software Dependencies: Yes. "The experimental environment is configured as follows: the Python version is 3.10.0, the CUDA version is 12.1, and the PyTorch version is 2.5.1."
Experiment Setup: Yes. "The AdamW optimizer is used in model training, with an initial learning rate of 1e-3, decaying to 1e-4. We explore hidden dimensions {32, 64, 128}, encoder depths {4, 6, 8}, and the impact of the number of spatial clusters and hierarchical levels in the Parameter Study."
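The learning-rate decay from 1e-3 to 1e-4 quoted above can be illustrated with a small schedule function. The paper does not specify the shape of the decay, so an exponential interpolation over a fixed number of steps is assumed here purely for illustration; `lr_at_step` and `total_steps` are hypothetical names, not from the paper.

```python
# Hypothetical sketch: exponentially interpolate the learning rate from an
# initial value of 1e-3 down to a final value of 1e-4 (schedule shape assumed).
def lr_at_step(step, total_steps, lr_init=1e-3, lr_final=1e-4):
    """Return the learning rate at `step`, clamped at lr_final after total_steps."""
    frac = min(step / total_steps, 1.0)
    return lr_init * (lr_final / lr_init) ** frac

print(lr_at_step(0, 100))    # 0.001
print(lr_at_step(50, 100))   # roughly 3.16e-4, the geometric midpoint
```

Any monotone schedule (step, cosine, exponential) that starts at 1e-3 and ends at 1e-4 would match the quoted description; the exponential form is just one concrete choice.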