Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

HPDM: A Hierarchical Popularity-aware Debiased Modeling Approach for Personalized News Recommender

Authors: Xiangfu He, Qiyao Peng, Minglai Shao, Hongtao Liu

IJCAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on two widely-used datasets, MIND and Adressa, demonstrate that our framework significantly outperforms existing baseline approaches in addressing both the long-tail distribution challenge. Our code is available at https://github.com/hexiangfu123/HPDM.
Researcher Affiliation | Collaboration | 1 College of Intelligence and Computing, Tianjin University, Tianjin, China; 2 Ximalaya Inc., Shanghai, China; 3 School of New Media and Communication, Tianjin University, Tianjin, China; 4 Du Xiaoman Financial Technology, Beijing, China
Pseudocode | No | The paper describes the methodology in text and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/hexiangfu123/HPDM.
Open Datasets | Yes | We evaluate our model on two real-world benchmark datasets: MIND [Wu et al., 2020] and Adressa [Gulla et al., 2017].
Dataset Splits | No | The paper provides statistics for the MIND and Adressa datasets in Table 2, but it does not specify how these datasets were split into training, validation, or test sets, nor does it mention exact percentages, sample counts, or standard predefined splits for reproducibility.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions Word2Vec and BERT for embeddings but does not specify version numbers for these or any other software libraries, frameworks, or programming languages used in the implementation.
Experiment Setup | Yes | We investigate two critical parameters in the framework: γ for scaling exposure rate in inverse probability weighting, and τ for adjusting the debiasing weight in the counterfactual framework. The optimal value of γ was observed around 0.1, striking a balance between mainstream and niche user interests, while τ achieved peak performance at approximately 0.12. Notably, in our experiments, we find that performance degrades at τ = 0, which could demonstrate the necessity of counterfactual debiasing in news recommendation. Hence, we carefully set the γ as 0.1 and the τ as 0.12.
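The roles of γ and τ described in the Experiment Setup row can be sketched as follows. This is a minimal illustration of the two hyperparameters, not the authors' implementation: the function names (`ipw_weight`, `debiased_score`) and the exact functional forms are assumptions made for clarity, using only the reported values γ = 0.1 and τ = 0.12.

```python
import numpy as np

# Illustrative sketch (NOT the HPDM code): gamma tempers the exposure
# rate inside an inverse-propensity weight, and tau scales a popularity
# correction in a counterfactual-style debiased score.

GAMMA = 0.1   # reported optimum for exposure-rate scaling
TAU = 0.12    # reported optimum for the debiasing weight

def ipw_weight(exposure_rate, gamma=GAMMA, eps=1e-8):
    """Inverse-propensity weight: rarely exposed items get larger weights.

    gamma < 1 softens the correction so mainstream interests are not
    drowned out by extreme upweighting of niche items.
    """
    return 1.0 / np.power(np.asarray(exposure_rate) + eps, gamma)

def debiased_score(interest, popularity, tau=TAU):
    """Counterfactual-style score: subtract a tau-weighted popularity
    term from the raw interest score (tau = 0 disables debiasing)."""
    return np.asarray(interest) - tau * np.asarray(popularity)

# A niche article (1% exposure) receives a larger training weight
# than a mainstream one (50% exposure).
w_niche = ipw_weight(0.01)
w_main = ipw_weight(0.5)
```

Under this sketch, setting τ = 0 reduces `debiased_score` to the raw interest score, which matches the report's observation that performance degrades without the counterfactual correction.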