Vertical Federated Learning with Missing Features During Training and Inference
Authors: Pedro Valdeira, Shiqiang Wang, Yuejie Chi
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments show improved performance of LASER-VFL over the baselines. Remarkably, this is the case even in the absence of missing features. For example, for CIFAR-100, we see an improvement in accuracy of 19.3% when each of four feature blocks is observed with a probability of 0.5 and of 9.5% when all features are observed. The code for this work is available at https://github.com/Valdeira/LASER-VFL. |
| Researcher Affiliation | Collaboration | Pedro Valdeira, Carnegie Mellon University & Instituto Superior Técnico (EMAIL); Shiqiang Wang, IBM Research (EMAIL); Yuejie Chi, Carnegie Mellon University (EMAIL) |
| Pseudocode | Yes | Algorithm 1: LASER-VFL Training Algorithm 2: LASER-VFL Inference |
| Open Source Code | Yes | The code for this work is available at https://github.com/Valdeira/LASER-VFL. |
| Open Datasets | Yes | HAPT (Reyes-Ortiz et al., 2016): a human activity recognition dataset. Credit (Yeh & Lien, 2009): a dataset for predicting credit card payment defaults. MIMIC-IV (Johnson et al., 2020): a time series dataset concerning healthcare data. CIFAR-10 & CIFAR-100 (Krizhevsky et al., 2009): two popular image datasets. |
| Dataset Splits | No | The paper mentions splitting samples into K=4 feature blocks for datasets, but does not provide specific train/test/validation dataset splits (e.g., percentages, exact counts, or citations to predefined splits for training/evaluation sets). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions models such as MLPs, LSTMs, and ResNet-18, and refers to methods like FedSVD, but does not specify any software libraries (e.g., PyTorch, TensorFlow) or their version numbers, which are crucial for reproducibility. |
| Experiment Setup | No | The paper does not provide specific experimental setup details such as learning rates, batch sizes, number of epochs, or optimizer configurations, which are essential hyperparameters for reproducing the results. |
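The setup described above (splitting each dataset's features into K = 4 blocks, with each block observed with probability 0.5) can be illustrated with a minimal sketch. This is not the paper's implementation (see the linked repository for that); the array shapes and the NumPy calls below are illustrative assumptions only.

```python
import numpy as np

# Hypothetical sketch of the paper's data setup (not the authors' code):
# vertically partition a feature matrix into K = 4 feature blocks, one per
# party, then observe each block independently with probability p = 0.5.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 12))           # 8 samples, 12 features (illustrative)
K, p = 4, 0.5

blocks = np.array_split(X, K, axis=1)  # K vertical feature blocks
observed = rng.random(K) < p           # Bernoulli(p) mask over blocks
available = [b for b, keep in zip(blocks, observed) if keep]

print(len(blocks), [b.shape for b in blocks], int(observed.sum()))
```

Under this sketch, a method such as LASER-VFL would be expected to produce a prediction from whichever subset of blocks in `available` happens to be observed, rather than requiring all K blocks at training and inference time.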