Iterative Sparse Attention for Long-sequence Recommendation
Authors: Guanyu Lin, Jinwei Luo, Yinfeng Li, Chen Gao, Qun Luo, Depeng Jin
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two real-world datasets show the superiority of our proposed method against state-of-the-art baselines. In this section, we experiment on two real-world datasets and explore the answers to the research questions (RQs): RQ1: How does the proposed ISA outperform state-of-the-art sequential recommendation models? RQ2: What is the impact of our sparse attention components? Are the high-level components, Sparse Attention Layer and Iterative Attention Layer, effective? RQ3: Does the proposed ISA still outperform state-of-the-art sequential models when varying sequence lengths? |
| Researcher Affiliation | Collaboration | Guanyu Lin (1,2), Jinwei Luo (3), Yinfeng Li (1), Chen Gao* (1), Qun Luo (4), Depeng Jin (1). Affiliations: 1 BNRist, Tsinghua University; 2 Carnegie Mellon University; 3 Shenzhen University; 4 Tencent Inc. |
| Pseudocode | No | The paper describes the methodology using textual explanations and mathematical formulations, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/tsinghua-fib-lab/ISA |
| Open Datasets | No | As for datasets, we conduct evaluations of recommendation performance on two large-scale datasets. The data statistics after 10-core filtering are given in Table 2, where mean length is the average sequence length per user. Table 2 caption: "Data statistics after 10-core setting filtering"; datasets: Taobao and Short Video. |
| Dataset Splits | No | Input: click item sequence I_u = (i_1, i_2, ..., i_t), t ≤ 100, for user u. Output: click probability of user u on target item i_{t+1}. As for datasets, we conduct evaluations of recommendation performance on two large-scale datasets; the data statistics after 10-core filtering are given in Table 2. These describe the task setup and dataset characteristics, but not specific train/validation/test splits for the experiments. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for running the experiments, such as GPU/CPU models, memory, or cloud computing specifications. |
| Software Dependencies | No | The paper does not provide specific details on software dependencies, such as library names with version numbers, used in the experiments. |
| Experiment Setup | No | The paper mentions hyperparameters like teleport probability α for Personalized Page Rank, window size w for window attention, and the number of random items r for random attention, as well as λ for regularization. However, it does not provide the specific values used for these hyperparameters or other system-level training configurations like learning rate, batch size, or optimizer settings. |
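The hyperparameters cited in the last row (window size w for window attention, number of random items r for random attention) describe a standard sparse-attention pattern. The sketch below is an illustrative reconstruction of such a window + random attention mask, not the authors' released implementation; the function name and mask layout are assumptions.

```python
import numpy as np

def sparse_attention_mask(t, w, r, seed=0):
    """Boolean (t, t) mask where True means "query q may attend to key k".

    Illustrative sketch only (not the paper's code): each query attends to
    a local window of +/- w neighbors plus r randomly sampled positions,
    mirroring the window-attention and random-attention hyperparameters
    mentioned in the report.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((t, t), dtype=bool)
    for q in range(t):
        lo, hi = max(0, q - w), min(t, q + w + 1)
        mask[q, lo:hi] = True                                   # local window
        rand = rng.choice(t, size=min(r, t), replace=False)     # random links
        mask[q, rand] = True
    return mask

mask = sparse_attention_mask(t=8, w=1, r=2)
```

At inference time such a mask is typically applied by setting disallowed attention logits to a large negative value before the softmax, so each query only mixes information from its window and its random positions.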