Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

HPDM: A Hierarchical Popularity-aware Debiased Modeling Approach for Personalized News Recommender

Authors: Xiangfu He, Qiyao Peng, Minglai Shao, Hongtao Liu

IJCAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on two widely-used datasets, MIND and Adressa, demonstrate that our framework significantly outperforms existing baseline approaches in addressing both the long-tail distribution challenge. Our code is available at https://github.com/hexiangfu123/HPDM.
Researcher Affiliation | Collaboration | 1 College of Intelligence and Computing, Tianjin University, Tianjin, China; 2 Ximalaya Inc., Shanghai, China; 3 School of New Media and Communication, Tianjin University, Tianjin, China; 4 Du Xiaoman Financial Technology, Beijing, China
Pseudocode | No | The paper describes the methodology in text and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/hexiangfu123/HPDM.
Open Datasets | Yes | We evaluate our model on two real-world benchmark datasets: MIND [Wu et al., 2020] and Adressa [Gulla et al., 2017].
Dataset Splits | No | The paper provides statistics for the MIND and Adressa datasets in Table 2, but it does not specify how these datasets were split into training, validation, or test sets, nor does it mention exact percentages, sample counts, or standard predefined splits for reproducibility.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions Word2Vec and BERT for embeddings but does not specify version numbers for these or any other software libraries, frameworks, or programming languages used in the implementation.
Experiment Setup | Yes | We investigate two critical parameters in the framework: γ for scaling exposure rate in inverse probability weighting, and τ for adjusting the debiasing weight in the counterfactual framework. The optimal value of γ was observed around 0.1, striking a balance between mainstream and niche user interests, while τ achieved peak performance at approximately 0.12. Notably, in our experiments, we find that performance degrades at τ = 0, which could demonstrate the necessity of counterfactual debiasing in news recommendation. Hence, we carefully set the γ as 0.1 and the τ as 0.12.
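The roles of γ and τ described in the Experiment Setup row can be sketched as follows. This is a minimal illustration of the two hyperparameters, not the authors' implementation: the function names (`ipw_weight`, `debiased_score`) and the exact functional forms are assumptions made for clarity, using only the reported values γ = 0.1 and τ = 0.12.

```python
import numpy as np

# Illustrative sketch (NOT the HPDM code): gamma tempers the exposure
# rate inside an inverse-propensity weight, and tau scales a popularity
# correction in a counterfactual-style debiased score.

GAMMA = 0.1   # reported optimum for exposure-rate scaling
TAU = 0.12    # reported optimum for the debiasing weight

def ipw_weight(exposure_rate, gamma=GAMMA, eps=1e-8):
    """Inverse-propensity weight: rarely exposed items get larger weights.

    gamma < 1 softens the correction so mainstream interests are not
    drowned out by extreme upweighting of niche items.
    """
    return 1.0 / np.power(np.asarray(exposure_rate) + eps, gamma)

def debiased_score(interest, popularity, tau=TAU):
    """Counterfactual-style score: subtract a tau-weighted popularity
    term from the raw interest score (tau = 0 disables debiasing)."""
    return np.asarray(interest) - tau * np.asarray(popularity)

# A niche article (1% exposure) receives a larger training weight
# than a mainstream one (50% exposure).
w_niche = ipw_weight(0.01)
w_main = ipw_weight(0.5)
```

Under this sketch, setting τ = 0 reduces `debiased_score` to the raw interest score, which matches the report's observation that performance degrades without the counterfactual correction.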