SecEmb: Sparsity-Aware Secure Federated Learning of On-Device Recommender System with Large Embedding
Authors: Peihua Mai, Youlong Ding, Ziyan Lyu, Minxin Du, Yan Pang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical analysis demonstrates that SecEmb reduces both download and upload communication costs by up to 90x and decreases user-side computation time by up to 70x compared with secure FedRec protocols. Additionally, it offers non-negligible utility advantages compared with lossy message compression methods. ... 5. Experiment Evaluation We evaluate our SecEmb on five public datasets: MovieLens 100K (ML100K), MovieLens 1M (ML1M), MovieLens 10M (ML10M), MovieLens 25M (ML25M), and Yelp ... Figure 5. User computation time (in milliseconds) for secret shares generation during training phase. ... Table 3 presents the prediction accuracy and reduction ratio of communication cost. |
| Researcher Affiliation | Academia | 1National University of Singapore, Singapore 2Hebrew University of Jerusalem, Jerusalem, Israel 3NUS (Chongqing) Research Institute, Chongqing, China 4Hong Kong Polytechnic University, Hong Kong SAR, China. |
| Pseudocode | Yes | Algorithm 1 PadOrTruncIdx ... Algorithm 2 PadOrTruncEmb ... Algorithm 3 SecEmb ... Algorithm 4 FSS.ConvertEval ... Algorithm 5 FSS.ConvertGen ... Algorithm 6 FSS.PathEval ... Algorithm 7 FSS.PathGen |
| Open Source Code | Yes | In this paper, we address the problem by proposing a sparsity-aware secure recommender system with large embedding updates (SecEmb). The implementation is available at https://github.com/NusIoraPrivacy/SecEmb. |
| Open Datasets | Yes | We evaluate our SecEmb on five public datasets: MovieLens 100K (ML100K), MovieLens 1M (ML1M), MovieLens 10M (ML10M), MovieLens 25M (ML25M), and Yelp (Harper & Konstan, 2015; Yelp, 2015). ... We extend our framework to sequential recommendation tasks... on the ML1M and Amazon datasets. Experimental settings are detailed in Appendix K.6. ... Amazon dataset (https://cseweb.ucsd.edu/~jmcauley/datasets/amazon/links.html). |
| Dataset Splits | Yes | Each dataset is divided into 80% training and 20% testing data. |
| Hardware Specification | No | The paper frequently mentions 'edge devices' and 'resource-constrained devices' and discusses computation time, but does not provide specific hardware details (e.g., specific CPU/GPU models, memory, or cloud instance types) for the experimental setup. Table 14 refers to 'Server computation cost' but does not specify the server hardware. |
| Software Dependencies | No | The paper mentions using 'Adaptive Moment Estimation (Adam) (Kingma, 2014) method' and 'FSS scheme' but does not specify version numbers for these or other key software components or libraries (e.g., Python, PyTorch, CUDA, etc.) used for implementation or experimentation. |
| Experiment Setup | Yes | Each dataset is divided into 80% training and 20% testing data. For all cases, the recommender system is trained for 200 epochs. Each user represents an individual client and 100 clients are selected in each iteration. The parameters are updated using Adaptive Moment Estimation (Adam) (Kingma, 2014) method. We use the combination of MSE and the regularization term as the loss function. The security parameter is set to λ = 128. Each experiment is run for four rounds and the average values are reported. Table 8 lists the specific hyperparameters for each dataset and model. |
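The experiment setup quoted above (80% training / 20% testing split, MSE plus a regularization term as the loss for embedding-based recommendation) can be illustrated with a minimal matrix-factorization sketch. This is a toy illustration, not the authors' code: the embedding dimension, learning rate, data sizes, and plain-SGD update are illustrative assumptions (the paper trains for 200 epochs with Adam and 100 clients per round; see Table 8 for its actual hyperparameters).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy interaction records: (user_id, item_id, rating); 20 users x 5 items each.
ratings = np.array(
    [(u, i, rng.uniform(1, 5))
     for u in range(20)
     for i in rng.choice(50, 5, replace=False)]
)

# 80% training / 20% testing split, as described in the paper.
rng.shuffle(ratings)
split = int(0.8 * len(ratings))
train, test = ratings[:split], ratings[split:]

# Illustrative hyperparameters (not from the paper).
embed_dim, reg_weight, lr = 8, 0.01, 0.05
U = rng.normal(0, 0.1, (20, embed_dim))   # user embeddings
V = rng.normal(0, 0.1, (50, embed_dim))   # item embeddings

def loss(data):
    # MSE plus L2 regularization, matching the loss form quoted above.
    u = data[:, 0].astype(int)
    i = data[:, 1].astype(int)
    r = data[:, 2]
    pred = np.sum(U[u] * V[i], axis=1)
    return np.mean((pred - r) ** 2) + reg_weight * (np.sum(U**2) + np.sum(V**2))

# One plain-SGD epoch over the training split (the paper uses Adam;
# SGD keeps the sketch short and dependency-free).
for u, i, r in train:
    u, i = int(u), int(i)
    err = U[u] @ V[i] - r
    U[u] -= lr * (err * V[i] + 2 * reg_weight * U[u])
    V[i] -= lr * (err * U[u] + 2 * reg_weight * V[i])
```

In the federated setting described by the paper, each user would hold only their own rows of `ratings` and the item-embedding updates would be aggregated securely; the sketch above collapses that into a single centralized loop purely to show the data split and loss.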